Apache Flume + HDFS Sink

Can we add a delimiter for the HDFS sink? When events are written to a file, how can we add a record separator?

Here is the configuration:

 tier1.sinks.hdfssink.type = hdfs
 tier1.sinks.hdfssink.channel = memory
 tier1.sinks.hdfssink.hdfs.path=tmp/kafka/%{topic}/%y-%m-%d
 tier1.sinks.hdfssink.hdfs.rollSize=268435456
 tier1.sinks.hdfssink.hdfs.rollCount=0
 tier1.sinks.hdfssink.hdfs.rollInterval = 0
 tier1.sinks.hdfssink.hdfs.useLocalTimeStamp=true
 tier1.sinks.hdfssink.hdfs.fileType=DataStream
 tier1.sinks.hdfssink.hdfs.inUseSuffix=.tmp
 tier1.sinks.hdfssink.hdfs.batchSize=10000

I would use a Flume EventSerializer for this. Its configuration looks something like this:

tier1.sinks.hdfssink.serializer = <your serialization class>
tier1.sinks.hdfssink.serializer.delimiter = < your delimiter>

See the following GitHub repository for details and code snippets.

https://github.com/relistan/flume-serializers
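
If a plain newline between records is all you need, the built-in text serializer may already be enough. As far as I know, it appends a newline after each event body by default, and serializer.appendNewline controls that. A minimal sketch, reusing the sink name from the question and assuming the rest of the sink configuration stays unchanged:

# Assumption: built-in "text" serializer, one event body per line
tier1.sinks.hdfssink.serializer = text
tier1.sinks.hdfssink.serializer.appendNewline = true

For any other delimiter, the serializer value should be the fully qualified name of a class that implements EventSerializer.Builder. A property such as serializer.delimiter only has an effect if that custom class reads it from its configuration context.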

Hope this helps!

