
hadoop -appendToFile on Google Compute Engine cluster

I am having trouble running the hadoop fs shell command -appendToFile on a GCE hadoop cluster. Other hadoop shell commands (e.g., -cat, -put, -mv) work on the GCE cluster, and -appendToFile works on a different hadoop cluster, but it fails on the GCE one. Syntax I have tried:

hdfs dfs -appendToFile two.log /tmp/test/one.log

yields:

"appendToFile: Failed to close file /tmp/test/one.log. Lease recovery is in progress. Try again later."

Here one.log is an existing file on HDFS, and two.log is an existing file on the local file system.

In addition:

hadoop fs -appendToFile two.log /tmp/test/one.log

yields many errors beginning with:

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.

I am not familiar with these errors, and I believe my syntax is correct. I have confirmed that other hadoop commands work with similar syntax, and the same commands work on other clusters. Any advice would be appreciated. Thank you!

You may be running into a known issue that affects the append command most prominently: HDFS-4600, "HDFS file append failing in multinode cluster". It shows up if you are running a default bdutil or Click-to-Deploy created Hadoop cluster with 2 datanodes and dfs.replication is still at its default value of 3.

As of the recent bdutil release 1.1.0, the default dfs.replication is 2, because the default settings already place data on Persistent Disk. A replication factor of 2 is a tradeoff: Hadoop still has greater availability against single-node failures, while the underlying Persistent Disk provides durability. So if you pick up the latest changes, manually set dfs.replication to a lower number, or increase the number of datanodes, append should start working.
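
To check whether your cluster matches this situation, something like the sketch below may help. The function names are made up for illustration, and the comparison is only a rough rule of thumb based on the reasoning above, not HDFS's exact pipeline logic. It also assumes that each datanode in the `hdfs dfsadmin -report` output is introduced by a line starting with "Name:".

```shell
#!/usr/bin/env bash

# Rough rule of thumb (an assumption, not the exact HDFS policy):
# append is at risk when the replication factor exceeds the datanode count.
replication_fits() {
  local datanodes=$1 replication=$2
  [ "$replication" -le "$datanodes" ]
}

# Reads both values from a live cluster. Needs the hdfs CLI on PATH and
# a reachable cluster, so it is only defined here, not called.
check_cluster() {
  local replication datanodes
  replication=$(hdfs getconf -confKey dfs.replication)
  # Assumes one "Name:" line per datanode in the report output.
  datanodes=$(hdfs dfsadmin -report | grep -c '^Name:')
  if replication_fits "$datanodes" "$replication"; then
    echo "dfs.replication=$replication fits $datanodes datanodes"
  else
    echo "dfs.replication=$replication exceeds $datanodes datanodes"
  fi
}

# Lowers the replication factor of the existing file from the question.
lower_file_replication() {
  hdfs dfs -setrep 2 /tmp/test/one.log
}
```

Running `check_cluster` on the master node should tell you whether the replication factor is higher than the datanode count. Note that a file written while dfs.replication was 3 keeps that replication factor even after the cluster default changes, which is why `lower_file_replication` changes it on the file itself before you retry the append.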
