Getting error while transferring files from ec2 to s3 using boto

I am following the procedure from this link to upload my mongodump to S3.

bash script

#!/bin/sh

MONGODB_SHELL='/usr/bin/mongo'

DUMP_UTILITY='/usr/bin/mongodump'
DB_NAME='amicus'

date_now=`date +%Y_%m_%d_%H_%M_%S`
dir_name='db_backup_'${date_now}
file_name='db_backup_'${date_now}'.bz2'

log() {
    echo "$1"
}

do_cleanup(){
    rm -rf db_backup_2010* 
    log 'cleaning up....'
}

do_backup(){
    # fsync_lock.js / unlock.js are helper scripts from the linked procedure
    # that lock and unlock writes while the dump runs
    log 'snapshotting the db and creating archive' && \
    ${MONGODB_SHELL} admin fsync_lock.js && \
    ${DUMP_UTILITY} -d ${DB_NAME} -o ${dir_name} && tar -jcf ${file_name} ${dir_name}
    ${MONGODB_SHELL} admin unlock.js && \
    log 'data backed up and snapshot created'
}

save_in_s3(){
    log 'saving the backup archive in amazon S3' && \
    python aws_s3.py set ${file_name} && \
    log 'data backup saved in amazon s3'
}

do_backup && save_in_s3 && do_cleanup

aws_s3.py

ACCESS_KEY=''
SECRET=''
BUCKET_NAME='s3:///s3.amazonaws.com/database-backup' #note that you need to create this bucket first

from boto.s3.connection import S3Connection
from boto.s3.key import Key

def save_file_in_s3(filename):
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    k = Key(bucket)
    k.key = filename
    k.set_contents_from_filename(filename)

def get_file_from_s3(filename):
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    k = Key(bucket)
    k.key = filename
    k.get_contents_to_filename(filename)

def list_backup_in_s3():
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    for i, key in enumerate(bucket.get_all_keys()):
        print "[%s] %s" % (i, key.name)

def delete_all_backups():
    #FIXME: validate filename exists
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    for i, key in enumerate(bucket.get_all_keys()):
        print "deleting %s" % (key.name)
        key.delete()

if __name__ == '__main__':
    import sys
    if len(sys.argv) < 3:
        print 'Usage: %s <get/set/list/delete> <backup_filename>' % (sys.argv[0])
    else:
        if sys.argv[1] == 'set':
            save_file_in_s3(sys.argv[2])
        elif sys.argv[1] == 'get':
            get_file_from_s3(sys.argv[2])
        elif sys.argv[1] == 'list':
            list_backup_in_s3()
        elif sys.argv[1] == 'delete':
            delete_all_backups()
        else:
            print 'Usage: %s <get/set/list/delete> <backup_filename>' % (sys.argv[0])

But I keep getting this error:

Traceback (most recent call last):
  File "aws_s3.py", line 42, in <module>
    save_file_in_s3(sys.argv[2])
  File "aws_s3.py", line 13, in save_file_in_s3
    k.set_contents_from_filename(filename)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1362, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1293, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 750, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 951, in _send_file_internal
    query_args=query_args
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 664, in make_request
    retry_handler=retry_handler
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1071, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1030, in _mexe
    raise ex
socket.error: [Errno 104] Connection reset by peer

I did a bit of research and found out that it's some kind of bug in boto. How do I proceed further with this?

As I didn't get any update on how to make it work, I used s3cmd in my bash script instead. But I still have to test it for files > 1 GB.

Here is the updated code -

#!/bin/sh

MONGODB_SHELL='/usr/bin/mongo'

DUMP_UTILITY='/usr/bin/mongodump'
DB_NAME='amicus'

date_now=`date +%Y_%m_%d_%H_%M_%S`
dir_name='db_backup_'${date_now}
file_name='db_backup_'${date_now}'.bz2'

log() {
    echo "$1"
}

do_cleanup(){
    rm -rf db_backup_2010* 
    log 'cleaning up....'
}

do_backup(){
    log 'snapshotting the db and creating archive' && \
    ${DUMP_UTILITY} -d ${DB_NAME} -o ${dir_name} && tar -jcf ${file_name} ${dir_name} && \
    log 'data backed up and snapshot created'
}

save_in_s3(){
    # the boto-based upload (python aws_s3.py set) is replaced by s3cmd
    log 'saving the backup archive in amazon S3' && \
    s3cmd put ${file_name} s3://YOURBUCKETNAME && \
    log 'data backup saved in amazon s3'
}

do_backup && save_in_s3 && do_cleanup

This probably has to do with the size of the file uploaded.

"Connection reset by peer" usually means that the remote server closed the connection (I don't think it's a boto problem). I'm also going to guess you are hitting some sort of timeout on the request (for a large file the transfer takes a long time).

I would recommend looking into multipart uploads if you want to do this yourself. See this example: https://gist.github.com/chrishamant/1556484

s3cmd does this behind the scenes, based on the file size.
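For completeness, here is a rough sketch of what a multipart upload could look like with the same boto 2.x library used above, so you don't have to shell out to s3cmd. This is only an illustration, not code from the original post: the function name save_file_in_s3_multipart and the 50 MB chunk size are my own choices, and it assumes the function is dropped into the aws_s3.py above so that ACCESS_KEY, SECRET and BUCKET_NAME are already defined.

import math
import os

from boto.s3.connection import S3Connection

def save_file_in_s3_multipart(filename, chunk_size=50 * 1024 * 1024):
    # Hypothetical helper, assumed to live in aws_s3.py next to the
    # functions above (reuses ACCESS_KEY, SECRET, BUCKET_NAME).
    # Each part is sent as its own request; parts can also be retried
    # individually, though this sketch simply cancels on error.
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    mp = bucket.initiate_multipart_upload(os.path.basename(filename))
    try:
        size = os.stat(filename).st_size
        parts = int(math.ceil(size / float(chunk_size)))
        fp = open(filename, 'rb')
        try:
            for i in range(parts):
                # upload_part_from_file reads from the current file
                # position; part numbers start at 1, and every part
                # except the last must be at least 5 MB
                mp.upload_part_from_file(
                    fp, part_num=i + 1,
                    size=min(chunk_size, size - i * chunk_size))
        finally:
            fp.close()
        mp.complete_upload()
    except Exception:
        mp.cancel_upload()
        raise

You could then call this from the 'set' branch of aws_s3.py instead of save_file_in_s3 when the archive is large.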
