How to configure Apache Flume for Facebook data streaming

Could you please provide the steps to configure Flume for Facebook data streaming?

We have successfully configured Flume to extract data from Twitter.

Please have a look at the configuration we created for extracting data from Twitter.

Flume configuration for Twitter

TwitterAgent.sources = Twitter 
TwitterAgent.channels = MemChannel 
TwitterAgent.sinks = HDFS

# Describing/Configuring the source
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = 4ENqf3q23iwdTSDJchv7w
TwitterAgent.sources.Twitter.consumerSecret = bAPTWfbRildBMWsEHo56SmZeXkftvZNCgvjHXbcUfAKoKzQjY0VIUOftTh6c
TwitterAgent.sources.Twitter.accessToken = 736128293661855746-rQIQYZNGCh9lW8XHCkjcnvwZH1BItnGi0XJ0gHM26F
TwitterAgent.sources.Twitter.accessTokenSecret = ehTsqX7GcU1aBqmekDcwPuu1csFOnfgzxc2EPtS0kudXOADeAAI
TwitterAgent.sources.Twitter.keywords = modi, india elections, bjp, congress, tdp, jana sena, pwan kalyan, mohanlal

# Describing/Configuring the sink

TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://env11-hadoop-master.trv.flytxt.com:54310/user/Hadoop/twitter_data
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = writable
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1
TwitterAgent.sinks.HDFS.hdfs.rollSize = 1
TwitterAgent.sinks.HDFS.hdfs.rollCount = 1

# Describing/Configuring the channel
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
# Duplicate key: if left active, this later assignment would override "memory" above
# TwitterAgent.channels.MemChannel.type = file

# Binding the source and sink to the channel 
TwitterAgent.sources.Twitter.channels = MemChannel

TwitterAgent.sinks.HDFS.channel = MemChannel

To configure the Twitter source, we need the following credentials:

TwitterAgent.sources.Twitter.consumerKey = 4ENqf3q23iwsdfmhadfjafjkemliSYs7w
TwitterAgent.sources.Twitter.consumerSecret = bAPTWfbRildangxvasxvhaxjasbxkjtvUfAKoKzQjY0VIUOftTh6c
TwitterAgent.sources.Twitter.accessToken = 7361282936618557ZNbcvHJxjxbnH1BItnGi0XJ0gHM26F
TwitterAgent.sources.Twitter.accessTokenSecret = ehTsASNMGCxvashgvcxjAHvcSFGcjahgPuu1csFO2EPtS0kudXOADeAAI

But how do we obtain the equivalent credentials for Facebook? Alternatively, kindly provide a working configuration for Facebook data streaming using Flume.

Yes, you can receive data from Facebook in the form of logs with the help of Scribe, a log aggregation tool developed by Facebook.

For installing Scribe, refer to: http://blog.octo.com/en/scribe-installation/

For how it works, refer to: http://blog.octo.com/en/scribe-a-way-to-aggregate-data-and-why-not-to-directly-fill-the-hdfs/
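
If Scribe is already collecting the logs, one way to get them into HDFS through Flume is the Scribe source that ships with Flume. The following is only a minimal sketch, mirroring the layout of the Twitter agent above. The agent name ScribeAgent, the component names, the scribe_data directory and the placeholder namenode address are all assumptions for illustration, and the upstream Scribe instance is assumed to be configured to forward its entries to the port given here.

```properties
# Hypothetical agent "ScribeAgent"; all component names are illustrative
ScribeAgent.sources = Scribe
ScribeAgent.channels = MemChannel
ScribeAgent.sinks = HDFS

# Scribe source: accepts log entries sent over Scribe's Thrift protocol
ScribeAgent.sources.Scribe.type = org.apache.flume.source.scribe.ScribeSource
# Port the upstream Scribe instance forwards to (assumed value, adjust as needed)
ScribeAgent.sources.Scribe.port = 1463
ScribeAgent.sources.Scribe.workerThreads = 5

# HDFS sink; replace the placeholder host and port with your namenode address
ScribeAgent.sinks.HDFS.type = hdfs
ScribeAgent.sinks.HDFS.hdfs.path = hdfs://<namenode-host>:<port>/user/Hadoop/scribe_data
ScribeAgent.sinks.HDFS.hdfs.fileType = DataStream

# In-memory channel buffering events between source and sink
ScribeAgent.channels.MemChannel.type = memory
ScribeAgent.channels.MemChannel.capacity = 10000
ScribeAgent.channels.MemChannel.transactionCapacity = 100

# Binding the source and sink to the channel
ScribeAgent.sources.Scribe.channels = MemChannel
ScribeAgent.sinks.HDFS.channel = MemChannel
```

Unlike the Twitter source, this source takes no consumer key or access token: it only receives what a Scribe instance pushes to it, so what ends up in HDFS depends entirely on what is being logged through Scribe.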

