flume syslog agent not picking up messages and placing them into HDFS
I am trying to simulate a syslog flume agent which should eventually put the data into HDFS.
My scenario is as follows:
The syslog flume agent is running on physical server A, with the following configuration details:
===
syslog_agent.sources = syslog_source
syslog_agent.channels = MemChannel
syslog_agent.sinks = HDFS
# Describing/Configuring the source
syslog_agent.sources.syslog_source.type = syslogudp
#syslog_agent.sources.syslog_source.bind = 0.0.0.0
syslog_agent.sources.syslog_source.bind = localhost
syslog_agent.sources.syslog_source.port = 514
# Describing/Configuring the sink
syslog_agent.sinks.HDFS.type=hdfs
syslog_agent.sinks.HDFS.hdfs.path=hdfs://<IP_ADD_OF_NN>:8020/user/ec2-user/syslog
syslog_agent.sinks.HDFS.hdfs.fileType=DataStream
syslog_agent.sinks.HDFS.hdfs.writeFormat=Text
syslog_agent.sinks.HDFS.hdfs.batchSize=1000
syslog_agent.sinks.HDFS.hdfs.rollSize=0
syslog_agent.sinks.HDFS.hdfs.rollCount=10000
syslog_agent.sinks.HDFS.hdfs.rollInterval=600
# Describing/Configuring the channel
syslog_agent.channels.MemChannel.type=memory
syslog_agent.channels.MemChannel.capacity=10000
syslog_agent.channels.MemChannel.transactionCapacity=1000
#Bind sources and sinks to the channel
syslog_agent.sources.syslog_source.channels = MemChannel
syslog_agent.sinks.HDFS.channel = MemChannel
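The source configured above can be exercised locally with a raw UDP datagram. The following is only an illustrative sketch (the helper function is mine, not part of Flume); note that because the source has bind = localhost, only datagrams sent from server-A itself will reach it:

```python
import socket

def syslog_datagram(text, facility=1, severity=5):
    """Build a minimal RFC 3164-style payload: <PRI>message.
    PRI = facility * 8 + severity; 13 = user.notice."""
    pri = facility * 8 + severity
    return ("<%d>%s" % (pri, text)).encode("utf-8")

if __name__ == "__main__":
    # Send one test event to the syslogudp source. Run this on
    # server-A itself, since the source only binds to localhost.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(syslog_datagram("flume test message"), ("localhost", 514))
    sock.close()
```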
sudo logger --server <IP_Address_physical_server_A> --port 514 --udp
I do see the log messages going into physical server-A's path --> /var/log/messages
But I don't see any messages going into HDFS; it seems the flume agent isn't able to get any data, even though the messages are going from server-B to server-A.
Am I doing something wrong here? Can anyone help me resolve this?
EDIT
The following is the output of the netstat command on server-A, where the syslog daemon is running:
tcp 0 0 0.0.0.0:514 0.0.0.0:* LISTEN 573/rsyslogd
tcp6 0 0 :::514 :::* LISTEN 573/rsyslogd
udp 0 0 0.0.0.0:514 0.0.0.0:* 573/rsyslogd
udp6 0 0 :::514 :::* 573/rsyslogd
I'm not sure what logger --server gives you, but most examples I have seen use netcat.
In any case, you've set batchSize=1000, so until you send 1000 messages, Flume will not write to HDFS.
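Given that batch size, one way to check whether the sink is otherwise working is to push at least 1000 events through the source in one go. A hedged sketch (the helper below is illustrative, and assumes it runs on server-A, since the source binds to localhost):

```python
import socket

def batch_payloads(n, text="batch test"):
    """Build n minimal <PRI>message syslog payloads (PRI 13 = user.notice)."""
    return [("<13>%s %d" % (text, i)).encode("utf-8") for i in range(n)]

if __name__ == "__main__":
    # Send 1000 events so the HDFS sink's batchSize=1000 threshold is crossed.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for payload in batch_payloads(1000):
        sock.sendto(payload, ("localhost", 514))
    sock.close()
```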
Keep in mind, HDFS is not a streaming platform, and it prefers not to have small files.
If you're looking for log collection, look into Elasticsearch or Solr fronted by a Kafka topic.