簡體   English   中英

來自Flume的Kafka的EOFException

[英]EOFException from Kafka in Flume

我正在嘗試建立從控制台Kafka生產者到Hadoop文件系統(HDFS)的簡單數據管道。 我正在使用64位Ubuntu虛擬機,並按照我所遵循的指南的建議為Hadoop和Kafka創建了單獨的用戶。 通過控制台用戶使用Kafka產生的輸入,並且HDFS似乎已啟動並運行。

現在,我想使用Flume將輸入通過管道傳輸到HDFS。 我正在使用以下配置文件:

tier1.sources  = source1
tier1.channels = channel1
tier1.sinks = sink1

tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.source1.zookeeperConnect = 127.0.0.1:2181
tier1.sources.source1.topic = test
tier1.sources.source1.groupId = flume
tier1.sources.source1.channels = channel1
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = timestamp
tier1.sources.source1.kafka.consumer.timeout.ms = 2000

tier1.channels.channel1.type = memory
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 1000

tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.hdfs.path = hdfs://flume/kafka/%{topic}/%y-%m-%d
tier1.sinks.sink1.hdfs.rollInterval = 5
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.channel = channel1

現在,當我使用以下命令運行Flume時

bin/flume-ng agent --conf ./conf -f conf/flume.conf -Dflume.root.logger=DEBUG,console -n tier1

我一遍又一遍地在控制台輸出中得到相同的異常:

2017-10-19 12:17:04,279 (lifecycleSupervisor-1-2) [DEBUG - org.apache.kafka.clients.NetworkClient.handleConnections(NetworkClient.java:467)] Completed connection to node 2147483647
2017-10-19 12:17:04,279 (lifecycleSupervisor-1-2) [DEBUG - org.apache.kafka.common.network.Selector.poll(Selector.java:307)] Connection with Ubuntu-Sandbox/127.0.1.1 disconnected
java.io.EOFException
    at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:286)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:193)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:222)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:311)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:890)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:853)
    at org.apache.flume.source.kafka.KafkaSource.doStart(KafkaSource.java:529)
    at org.apache.flume.source.BasicSourceSemantics.start(BasicSourceSemantics.java:83)
    at org.apache.flume.source.PollableSourceRunner.start(PollableSourceRunner.java:71)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:249)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

停止Flume的唯一方法是殺死Java進程。

我以為這可能與Hadoop和Kafka的單獨用戶有關,但是即使使用Kafka用戶運行所有內容,我也會得到相同的結果。 我也沒有在線找到有關EOFException方法的任何信息,考慮到我剛剛遵循了“入門”指南並對所有內容使用了非常標准的配置,因此這很奇怪。

也許與上一行(“ Ubuntu-Sandbox / 127.0.1.1已斷開連接”)以及我的VM的配置有關?

任何幫助深表感謝!

您是否考慮過改用Kafka Connect(Apache Kafka的一部分)和HDFS連接器 通常認為這已取代Flume。 它易於使用,並且具有與Flume類似的基於文件的配置。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM