
Getting error when copying file from s3:// to local (Hadoop) file system

I am trying to copy files from S3 to the Hadoop file system using Python, and I get the following error:

cp: `foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz': No such file or directory

I recently migrated to the latest Hadoop version (2.4.0). In version 0.20 this works fine. Why am I getting this error in version 2.4.0?

In Hadoop version 0.20

hadoop@ip-10-76-38-167:~$ /home/hadoop/bin/hadoop fs -cp s3://test.com/foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz /foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz

15/02/13 11:21:45 INFO s3native.NativeS3FileSystem: Opening 's3://test.com/foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz' for reading

In Hadoop version 2.4.0

[hadoop@ip-10-169-19-123 ~]$ /home/hadoop/bin/hadoop fs -cp s3://test.com/foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz /foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz

15/02/13 11:21:37 INFO guice.EmrFSBaseModule: Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as FileSystem implementation.

15/02/13 11:21:38 INFO fs.EmrFileSystem: Using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation

cp: `foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz': No such file or directory

Try adding double quotes around the paths, like this:

/home/hadoop/bin/hadoop fs -cp "s3://test.com/foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz" "/foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz"

I found the answer on my own.

Use `distcp` instead of `fs -cp`.

This command works without any issues.
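
For example, something like the following should do the copy (a sketch only; the original post does not show the exact distcp invocation, so the same bucket and paths from the question are assumed here):

/home/hadoop/bin/hadoop distcp s3://test.com/foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz /foo/ds=2015-02-13/ip-d1b-request-2015-02-13_10-00_10-09.txt.gz

Note that DistCp performs the copy as a MapReduce job rather than streaming the data through the client, so it also scales better when copying many or large files.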
