
Amazon S3 - download failures with Boto

I started to experience a lot of problems with file downloads from S3 after moving from Ubuntu 12.04 to Ubuntu 14.04. In about 1 in 20 cases boto fails to download the file and hangs for 1-2 minutes before throwing an exception.

The problem does not reproduce for very small files, only for medium-sized and large files.

I wrote a simple Python script to test this:

import datetime
from boto.s3.connection import S3Connection

# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, bucket_name and path
# are assumed to be defined before this point.

success = 0
for i in xrange(1000000):
    try:
        start = datetime.datetime.now()
        # Open a fresh connection and read the whole object into memory.
        s3conn = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
        bucket = s3conn.get_bucket(bucket_name)
        key = bucket.get_key(path)
        content = key.get_contents_as_string()
        delta = datetime.datetime.now() - start
        print 'Downloading completed in', delta.total_seconds(), 's, file size is', len(content), 'bytes'
        success += 1
        # Running success rate over all attempts so far, failed ones included.
        print 'Downloaded', i + 1, 'files, success rate: ', float(success) / (i + 1)
    except Exception as exc:
        print 'Error occurred:', exc

Here is some output of this script from my Ubuntu 14.04 machine:

Downloading completed in 1.76665 s, file size is 996320 bytes
Downloaded 1 files, success rate:  1.0
Downloading completed in 7.709181 s, file size is 996320 bytes
Downloaded 2 files, success rate:  1.0
Downloading completed in 1.762192 s, file size is 996320 bytes
Downloaded 3 files, success rate:  1.0
Downloading completed in 7.670499 s, file size is 996320 bytes
Downloaded 4 files, success rate:  1.0
Downloading completed in 1.806259 s, file size is 996320 bytes
Downloaded 5 files, success rate:  1.0
Downloading completed in 1.992967 s, file size is 996320 bytes
Downloaded 6 files, success rate:  1.0
...
...
...
Downloading completed in 6.496797 s, file size is 996320 bytes
Downloaded 21 files, success rate:  1.0
Error occurred: [Errno 104] Connection reset by peer
Downloading completed in 2.31506 s, file size is 996320 bytes
Downloaded 23 files, success rate:  0.95652173913
Error occurred: The read operation timed out
Error occurred: The read operation timed out
Downloading completed in 1.963559 s, file size is 996320 bytes
Downloaded 26 files, success rate:  0.884615384615
Downloading completed in 1.395313 s, file size is 996320 bytes
Downloaded 27 files, success rate:  0.888888888889
Downloading completed in 1.416122 s, file size is 996320 bytes
Downloaded 28 files, success rate:  0.892857142857
Downloading completed in 1.168238 s, file size is 996320 bytes
Downloaded 29 files, success rate:  0.896551724138
Downloading completed in 1.30582 s, file size is 996320 bytes
Downloaded 30 files, success rate:  0.9

I tried this script on Windows and Mac machines on the same local network, and every download succeeded. I also had no issues on my Ubuntu 12.04 Amazon EC2 instance:

...
Downloading completed in 2.015681 s, file size is 996320 bytes
Downloaded 100 files, success rate:  1.0

Has anyone faced similar issues? Where should I look? I tried to debug the boto library, but without success. The important thing is that I have no problems when I use other methods of downloading files on this machine; only boto fails. I tried different boto versions: 2.15.0 and 2.34.0.

It turns out this has nothing to do with boto, as I was later able to reproduce it with curl.
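One way to take boto out of the picture, in the same spirit as the curl test, is to fetch the object over plain HTTPS and count failures. The following is only a sketch for Python 3, assuming the object is reachable through a URL (public or pre-signed); `fetch`, `measure` and `OBJECT_URL` are hypothetical names, not part of the original test.

```python
import time
from urllib.request import urlopen


def fetch(url, timeout=30):
    """Download url with the standard library only and return the body."""
    response = urlopen(url, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


def measure(download, attempts):
    """Call download() `attempts` times.

    Returns (successes, errors, durations), where errors holds the
    message of each failed attempt and durations the elapsed seconds
    of each successful one.
    """
    successes = 0
    errors = []
    durations = []
    for _ in range(attempts):
        start = time.time()
        try:
            download()
        except Exception as exc:  # connection resets and timeouts land here
            errors.append(str(exc))
        else:
            successes += 1
            durations.append(time.time() - start)
    return successes, errors, durations
```

It could be used as `measure(lambda: fetch(OBJECT_URL), 100)`, where `OBJECT_URL` is a placeholder for the object's address. If the failure rate matches what boto shows, the problem most likely sits below the library, in the network path or the machine's TCP/TLS stack.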

I fixed the problem for myself by moving the data from the European S3 region to the "US Standard" region, but I am still interested in how this can happen. On one machine in the local network all files download perfectly, while another machine on the same network sees 10-20% failures.

I will raise this with Amazon if it bothers me further.

When creating a connection, you should specify the region; otherwise the request might time out because boto might try another region.

import boto.s3
conn = boto.s3.connect_to_region(aws_region, **creds)

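where aws_region is a region name string (for example 'eu-west-1') and creds is a dictionary of your credentials (the aws_access_key_id and aws_secret_access_key keyword arguments).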
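To make the shape of creds concrete, here is a minimal sketch assuming boto 2 is installed. The helper names `build_creds` and `connect` are made up for illustration, and the None check relies on boto 2 returning None for a region name it does not recognise.

```python
def build_creds(access_key_id, secret_access_key):
    """Return the credential keyword arguments boto 2 connections accept."""
    return {
        'aws_access_key_id': access_key_id,
        'aws_secret_access_key': secret_access_key,
    }


def connect(aws_region, creds):
    """Open an S3 connection pinned to a single region (needs boto 2)."""
    # Imported lazily so build_creds() can be used without boto installed.
    import boto.s3
    conn = boto.s3.connect_to_region(aws_region, **creds)
    if conn is None:
        # Assumption: an unknown region name yields None instead of an error.
        raise ValueError('Unknown S3 region: %r' % aws_region)
    return conn
```

With the question's data in the European region, the call might look like `connect('eu-west-1', build_creds(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY))`; the region name here is a guess and should match wherever the bucket actually lives.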
