
Hadoop Docker container could only be replicated to 0 nodes instead of minReplication (=1)

I have tried different Docker images for Hadoop containers, but none of them work when I try to write files to HDFS. I always get this error:

Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /t/_temporary/0/_temporary/attempt_20200528153700_0001_m_000006_7/part-00006-34c8bc6d-68a3-4177-bfbf-5f225b28c157-c000.snappy.parquet could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.

What I have tried so far:

  1. Formatted the NameNode, as suggested in similar questions.
  2. Exposed the needed ports: 8088, 50070, 9000, 50010 (see the sketch after this list).
  3. Made sure the DataNodes have enough space.
  4. Updated the hosts file, mapping 127.0.0.1 to the container name.
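Roughly, the hosts entry and the published ports look like this (the container name hadoop-master and the image name are placeholders for whatever image is in use):

    # /etc/hosts on my local machine
    127.0.0.1   hadoop-master

    # ports published when starting the container
    docker run -d --name hadoop-master \
        -p 8088:8088 -p 50070:50070 -p 9000:9000 -p 50010:50010 \
        some-hadoop-image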

I'm running the app on my local computer, and the Docker containers are running locally as well.

After creating a basic DataFrame, I try to write it:

df.write.save('hdfs://hadoop-master:9000/t', format='parquet', mode='append')
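For reference, the full script is essentially this minimal sketch (the DataFrame contents are just dummy data):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName('hdfs-write-test').getOrCreate()

    # a trivial DataFrame, just enough to exercise the HDFS write path
    df = spark.createDataFrame([(1, 'a'), (2, 'b')], ['id', 'value'])
    df.write.save('hdfs://hadoop-master:9000/t', format='parquet', mode='append')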

The call takes almost two minutes, then throws the error above.

The WebUI looks fine, and I can put files into HDFS with commands inside the container.
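For example, something like this works from a shell inside the container (file and path names are just examples):

    docker exec -it hadoop-master bash
    echo test > sample.txt
    hdfs dfs -put sample.txt /t/
    hdfs dfs -ls /t/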

It seems like a network/connection problem to me, but I couldn't track it down.
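One explanation I have seen suggested, but have not been able to verify, is that the NameNode hands the client the DataNodes' internal container IPs, which are unreachable from outside Docker, so the client ends up marking all three DataNodes as excluded. The commonly suggested client-side setting for that case tells the HDFS client to use DataNode hostnames instead of IPs:

    spark = SparkSession.builder \
        .appName('hdfs-write-test') \
        .config('spark.hadoop.dfs.client.use.datanode.hostname', 'true') \
        .getOrCreate()

Each DataNode hostname then also has to resolve from the local machine, e.g. via additional /etc/hosts entries.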

I didn't solve the problem, but I found a quick workaround.

TL;DR

macOS may cause this problem.

I built a new Debian server on GCP, installed Docker, the same images, and the Python code I had been testing. There it worked fine, but I still get the error when I try to connect from my local machine.
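The setup on the new server is roughly this (packages from Debian's own repositories; versions omitted):

    sudo apt-get update
    sudo apt-get install -y docker.io python3-pip
    pip3 install pyspark
    # then pull the same Hadoop images and run the same script as above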

I still need a real answer, but I'm sharing this for anyone who needs a quick solution.
