
Could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation. Using docker

I made a docker-compose.yml file that includes multiple Apache services such as Hadoop, Kafka and Flume. Currently, I am trying to retrieve data with Kafka, send it to Flume (to be able to transform the data structure) and store it inside HDFS. I generate dummy data using a Kafka producer, with which I can send messages to the Kafka broker. Flume listens on a certain topic, transforms the data, defines its target location and tries to send it to HDFS. Whenever the Flume agent notices data coming in, the following error occurs:

2021-11-14 20:16:13,554 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-11-14 20:16:17,448 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742188_1365
java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.Net.pollConnect(Native Method)
    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
    at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
    at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-11-14 20:16:17,451 WARN hdfs.DataStreamer: Abandoning BP-2051009381-192.168.160.8-1635954925420:blk_1073742188_1365
2021-11-14 20:16:17,462 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.192.12:50010,DS-0eb49c38-45e0-46bb-be71-23f07b5ac9dc,DISK]
2021-11-14 20:16:28,525 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742189_1366
java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.Net.pollConnect(Native Method)
    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
    at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
    at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-11-14 20:16:28,525 WARN hdfs.DataStreamer: Abandoning BP-2051009381-192.168.160.8-1635954925420:blk_1073742189_1366
2021-11-14 20:16:28,533 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.192.10:50010,DS-829fd615-4b31-4379-874a-ad06769d138e,DISK]
2021-11-14 20:16:29,557 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742190_1367
java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.Net.pollConnect(Native Method)
    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
    at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
    at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-11-14 20:16:29,557 WARN hdfs.DataStreamer: Abandoning BP-2051009381-192.168.160.8-1635954925420:blk_1073742190_1367
2021-11-14 20:16:29,569 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.192.11:50010,DS-3c3a744b-d53c-4cb5-97ac-4dd3e128f6a7,DISK]
2021-11-14 20:16:29,588 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /kafka/last-test-5/14-11-21/sensor-data.1636917373340.tmp could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1573)
    at org.apache.hadoop.ipc.Client.call(Client.java:1519)
    at org.apache.hadoop.ipc.Client.call(Client.java:1416)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
    at jdk.proxy2/jdk.proxy2.$Proxy14.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:530)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at jdk.proxy2/jdk.proxy2.$Proxy15.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1084)
    at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1898)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1700)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-11-14 20:16:29,590 WARN hdfs.DFSClient: Error while syncing

The data is written to HDFS, but the generated file(s) have a size of 0 bytes and contain no content.

Does anyone know what causes this error and how to fix it?

Docker images used for this project:

  • bde2020 (for Hadoop)
  • bitnami (for Kafka & Zookeeper)

To reproduce this issue, I created a git repo from which you can pull the project to recreate the error: https://github.com/Benjaminbakir/Big-data-test

You will also have to install Flume on your local machine to run the agent.conf file.
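
For reference, the config follows the usual Kafka source → memory channel → HDFS sink layout. Below is a minimal sketch; the broker address, topic name and HDFS path are assumptions pieced together from the description and the log above, not necessarily the repo's literal config:

# agent.conf (sketch): Kafka source -> memory channel -> HDFS sink
agent.sources = kafka-source
agent.channels = mem-channel
agent.sinks = hdfs-sink

agent.sources.kafka-source.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.kafka-source.kafka.bootstrap.servers = localhost:9092
agent.sources.kafka-source.kafka.topics = test
agent.sources.kafka-source.channels = mem-channel

agent.channels.mem-channel.type = memory

agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://namenode:9000/kafka/last-test-5/%d-%m-%y
agent.sinks.hdfs-sink.hdfs.filePrefix = sensor-data
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs-sink.channel = mem-channel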

The file can be run with the following command (you will have to cd to the directory where the config file is stored): flume-ng agent -c . -f agent.conf --name agent -Xmx512m

Finally, you need to add the following to your /etc/hosts file:

  • 127.0.0.1 localhost namenode datanode1 datanode2 datanode3
  • ::1 localhost namenode datanode1 datanode2 datanode3

When you now send a message with a Kafka producer to a topic named "test", the error should show up.

Command to create a Kafka topic: /opt/bitnami/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic test --partitions 3 --replication-factor 1

Command to create a producer: $KAFKA_HOME/opt/bitnami/kafka/bin/kafka-console-producer.sh --broker-list kafka:9092 --topic=test

Please let me know if anything is not clear enough; I will then try to explain it in more detail.

PS: The Hadoop cluster is healthy: the datanodes and namenode are running, and the user can download/upload files manually via the Hadoop web UI, but when data is sent via Kafka & Flume this error occurs.
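
(A quick way to double-check that state from the command line, assuming the namenode container is actually named namenode like its /etc/hosts entry above:)

docker exec -it namenode hdfs dfsadmin -report   # lists live datanodes and their capacity
docker exec -it namenode hdfs dfs -ls /kafka     # shows the 0-byte files Flume leaves behind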

You just need to do some "port proxying", i.e. forward ip:port to other_ip:port. The stack trace shows the client being refused on the datanodes' 50010 data-transfer ports (192.168.192.x), which are only reachable inside the Docker network, so those addresses have to be forwarded to ports the Flume host can actually reach.
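
A sketch of one way to do that with the bde2020 images: publish each datanode's data-transfer port on the host and give every datanode a distinct dfs.datanode.address, so that the datanodeN names added to /etc/hosts resolve to ports that are actually reachable. The image tag, port numbers and the HDFS_CONF_ environment-variable convention below are assumptions, not taken from the question's repo:

# docker-compose fragment (sketch): one published data-transfer port per datanode
datanode1:
  image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
  environment:
    - HDFS_CONF_dfs_datanode_address=0.0.0.0:50010
  ports:
    - "50010:50010"
datanode2:
  image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
  environment:
    - HDFS_CONF_dfs_datanode_address=0.0.0.0:50011
  ports:
    - "50011:50011"
datanode3:
  image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
  environment:
    - HDFS_CONF_dfs_datanode_address=0.0.0.0:50012
  ports:
    - "50012:50012"

On the client side, setting dfs.client.use.datanode.hostname=true makes the HDFS client connect via the datanodeN hostnames instead of the internal 192.168.x.x addresses. A plain TCP forwarder can play the same role as the published ports, e.g.: socat TCP-LISTEN:50010,fork,reuseaddr TCP:192.168.192.10:50010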
