
Writing tweets to HDFS using Flume does not work for the agent

This is my Twitter.conf:

TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS


TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = <<API key>>
TwitterAgent.sources.Twitter.consumerSecret = <<API secret>>
TwitterAgent.sources.Twitter.accessToken = <<Access token>>
TwitterAgent.sources.Twitter.accessTokenSecret = <<Access token secret>>
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics

TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://Singh:9000/flume/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100

This is my flume.log:

20 Sep 2014 14:22:02,286 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61)  - Configuration provider starting
20 Sep 2014 14:22:02,297 INFO  [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133)  - Reloading configuration file:/home/vijay/BigData/flume-1.5.0.1/conf/twitter.conf
20 Sep 2014 14:22:02,307 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,308 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,308 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,309 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:930)  - Added sinks: HDFS Agent: TwitterAgent
20 Sep 2014 14:22:02,309 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,309 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,309 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,309 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,309 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:HDFS
20 Sep 2014 14:22:02,322 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140)  - Post-validation flume configuration contains configuration for agents: [TwitterAgent]
20 Sep 2014 14:22:02,322 WARN  [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:138)  - No configuration found for this host:agent
20 Sep 2014 14:22:02,330 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:138)  - Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }

Could you please help me out with this?

Try this Flume configuration:

TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS


TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = <<API key>>
TwitterAgent.sources.Twitter.consumerSecret = <<API secret>>
TwitterAgent.sources.Twitter.accessToken = <<Access token>>
TwitterAgent.sources.Twitter.accessTokenSecret = <<Access token secret>>


TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics


TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:54310/user/bigdata/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000


TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 10000
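
One difference from the configuration in the question is transactionCapacity, raised from 100 to 10000. The HDFS sink takes up to hdfs.batchSize events from the channel in a single transaction, so a memory channel whose transactionCapacity is smaller than the batch size can fail when the sink drains a full batch. A minimal sketch of that sanity check in Python follows; the helper names parse_properties and batch_fits_channel are hypothetical, and the parser handles only simple "key = value" lines.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines; blank lines and '#' comments are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def batch_fits_channel(props, agent, sink, channel):
    """Return True if the sink's hdfs.batchSize fits in the channel's transactionCapacity."""
    batch = int(props[f"{agent}.sinks.{sink}.hdfs.batchSize"])
    capacity = int(props[f"{agent}.channels.{channel}.transactionCapacity"])
    return batch <= capacity


# Only the two relevant properties from each configuration are reproduced here.
QUESTION_CONF = """
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
"""
ANSWER_CONF = """
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.channels.MemChannel.transactionCapacity = 10000
"""

if __name__ == "__main__":
    for label, conf in (("question", QUESTION_CONF), ("answer", ANSWER_CONF)):
        ok = batch_fits_channel(parse_properties(conf), "TwitterAgent", "HDFS", "MemChannel")
        print(label, "ok" if ok else "batchSize exceeds transactionCapacity")
```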

You also need to add this jar, http://files.cloudera.com/samples/flume-sources-1.0-SNAPSHOT.jar, to the flume/lib folder.
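
Separately, the WARN line in the log, "No configuration found for this host:agent", suggests the agent was started under the name agent, while every property in the file is prefixed with TwitterAgent. Flume loads only the properties whose prefix matches the name passed to flume-ng agent with --name (or -n), which would explain the empty sourceRunners, sinkRunners and channels in the last log line. A minimal Python sketch of that check, assuming a properties file with simple "key = value" lines (the helper name agent_names is hypothetical):

```python
def agent_names(conf_text):
    """Collect agent names: the first dotted component of every property key."""
    names = set()
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        names.add(key.split(".", 1)[0])
    return names


# The first three lines of the configuration are enough to show the prefix.
CONF = """
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
"""

if __name__ == "__main__":
    started_as = "agent"  # name taken from the WARN line in the log
    defined = agent_names(CONF)
    if started_as not in defined:
        print(f"no configuration for {started_as!r}; file defines {sorted(defined)}")
```

If that is the cause, starting the agent with --name TwitterAgent should let Flume pick up the source, channel and sink.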
