
HDFS write from Kafka: createBlockOutputStream Exception

I am running Hadoop in a Docker swarm with 1 namenode and 3 datanodes (on 3 physical machines). I am also using Kafka and Kafka Connect with the HDFS connector to write messages to HDFS in Parquet format.
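For context, the sink configuration looks roughly like this; the connector class and format class are the standard ones from the Confluent HDFS connector, while the flush size, namenode address, and Hadoop config path are illustrative placeholders rather than exact values:

    name=hdfs-sink
    connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
    tasks.max=1
    # Kafka topic to sink (matches the path that appears in the logs below)
    topics=test_hdfs
    # namenode address as seen from the Connect worker -- illustrative
    hdfs.url=hdfs://namenode:8020
    # write records as Parquet files
    format.class=io.confluent.connect.hdfs.parquet.ParquetFormat
    # records to accumulate before committing a file to HDFS -- illustrative
    flush.size=1000
    # directory holding the core-site.xml / hdfs-site.xml the connector should use -- illustrative
    hadoop.conf.dir=/etc/hadoop/conf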

I can write data to HDFS using the HDFS client (hdfs put) without any problem. But when Kafka is writing messages, it works at the very beginning and then fails with the following error:

org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.0.8:50010]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1533)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1309)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1262)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
[2018-05-23 10:30:10,125] INFO Abandoning BP-468254989-172.17.0.2-1527063205150:blk_1073741825_1001 (org.apache.hadoop.hdfs.DFSClient:1265)
[2018-05-23 10:30:10,148] INFO Excluding datanode DatanodeInfoWithStorage[10.0.0.8:50010,DS-cd1c0b17-bebb-4379-a5e8-5de7ff7a7064,DISK] (org.apache.hadoop.hdfs.DFSClient:1269)
[2018-05-23 10:31:10,203] INFO Exception in createBlockOutputStream (org.apache.hadoop.hdfs.DFSClient:1368)
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.0.9:50010]
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1533)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1309)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1262)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)

And then, the datanodes are no longer available to the process:

[2018-05-23 10:32:10,316] WARN DataStreamer Exception (org.apache.hadoop.hdfs.DFSClient:557)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /topics/+tmp/test_hdfs/year=2018/month=05/day=23/hour=08/60e75c4c-9129-454f-aa87-6c3461b54445_tmp.parquet could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1733)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2496)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:828)

However, if I look at the Hadoop web admin console, all the nodes appear to be up and healthy.

I have checked hdfs-site.xml, and the property "dfs.client.use.datanode.hostname" is set to true on both the namenode and the datanodes. All the IPs in the Hadoop configuration files are defined using 0.0.0.0 addresses.
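Concretely, the entry on those nodes is of the following form in hdfs-site.xml (the property name is the standard HDFS one; nothing else is assumed):

    <property>
      <!-- have clients reach datanodes by hostname instead of the reported IP -->
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value>
    </property>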

I have also tried to reformat the namenode, but the error occurred again.
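(Reformatting here means stopping the namenode and running the command below before restarting it; it wipes the HDFS metadata, so it was only done as an experiment.)

    hdfs namenode -format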

Could the problem be that Kafka is writing into HDFS too fast and overwhelming it? That would be strange, because I tried the same configuration on a smaller cluster, and it worked fine even with a high throughput of Kafka messages.

Do you have any other ideas about the origin of this problem?

Thanks

dfs.client.use.datanode.hostname=true must also be configured on the client side, and, looking at your log stack:

java.nio.channels.SocketChannel[connection-pending remote=/10.0.0.9:50010]

I guess 10.0.0.9 refers to a private network IP; so it seems that the property is not set in hdfs-client.xml on the client side.
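A minimal sketch of the fix, assuming the client here is the Kafka Connect HDFS connector: place an hdfs-site.xml containing

    <property>
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value>
    </property>

on the Connect worker and point the connector at its directory via hadoop.conf.dir (for example hadoop.conf.dir=/etc/hadoop/conf); the setting is then picked up by the DFSClient that opens the block output streams.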

You can find more details here.
