
Boto3 - Disable automatic multipart upload

I am using an S3-compatible backend that does not support MultipartUpload.

I have a strange situation where my uploads finish normally on some servers, but on other servers boto3 automatically tries to upload the file using MultipartUpload. The file I am trying to upload is exactly the same as the one used for testing, against the same backend, region/tenant, bucket, etc...

As the documentation shows, MultipartUpload is enabled automatically when needed:

  • Automatically switching to multipart transfers when a file is over a specific size threshold

Here are some logs from when it automatically switches to MultipartUpload:

DEBUG:botocore.hooks:Event request-created.s3.CreateMultipartUpload: calling handler <function enable_upload_callbacks at 0x2b001b8>
DEBUG:botocore.endpoint:Sending http request: <PreparedRequest [POST]>
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): mytenant.mys3backend.cloud.corp
DEBUG:botocore.vendored.requests.packages.urllib3.connectionpool:"POST /cassandra/samplefile.tgz?uploads HTTP/1.1" 501 None
DEBUG:botocore.parsers:Response headers: {'date': 'Fri, 18 Dec 2015 09:12:48 GMT', 'transfer-encoding': 'chunked', 'content-type': 'application/xml;charset=UTF-8', 'server': 'HCP V7.2.0.26'}
DEBUG:botocore.parsers:Response body:
<?xml version='1.0' encoding='UTF-8'?>
<Error>
  <Code>NotImplemented</Code>
  <Message>The request requires functionality that is not implemented in the current release</Message>
  <RequestId>1450429968948</RequestId>
  <HostId>aGRpLmJvc3RoY3AuY2xvdWQuY29ycDoyNg==</HostId>
</Error>     
DEBUG:botocore.hooks:Event needs-retry.s3.CreateMultipartUpload: calling handler <botocore.retryhandler.RetryHandler object at 0x2a490d0>

And here are the logs from the other servers, which do not switch to multipart, for the same file:

DEBUG:botocore.hooks:Event request-created.s3.PutObject: calling handler <function enable_upload_callbacks at 0x7f436c025500>
DEBUG:botocore.endpoint:Sending http request: <PreparedRequest [PUT]>
INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): mytenant.mys3backend.cloud.corp
DEBUG:botocore.awsrequest:Waiting for 100 Continue response.
DEBUG:botocore.awsrequest:100 Continue response seen, now sending request body.
DEBUG:botocore.vendored.requests.packages.urllib3.connectionpool:"PUT /cassandra/samplefile.tgz HTTP/1.1" 200 0
DEBUG:botocore.parsers:Response headers: {'date': 'Fri, 18 Dec 2015 10:05:25 GMT', 'content-length': '0', 'etag': '"b407e71de028fe62fd9f2f799e606855"', 'server': 'HCP V7.2.0.26'}
DEBUG:botocore.parsers:Response body:

DEBUG:botocore.hooks:Event needs-retry.s3.PutObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7f436be1ecd0>
DEBUG:botocore.retryhandler:No retry needed.

I am uploading the file as follows:

connection = boto3.client(service_name='s3',
        region_name='',
        api_version=None,
        use_ssl=True,
        verify=True,
        endpoint_url=url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        aws_session_token=None,
        config=None)
connection.upload_file('/tmp/samplefile.tgz','mybucket','remotefile.tgz')

The questions are:

  • How can I disable MultipartUpload, or increase the default threshold, in order to avoid the automatic switch to multipart uploads?
  • Is there any reason why one server uses automatic multipart while the others do not, for the same file?

I came across your question while searching for boto3:

  Does it automatically switch to multipart transfers when a file is over a specific size threshold?

Yes, upload_file (from client / resource / S3Transfer) will automatically switch to multipart upload; the default threshold size is 8 MB.

If you do not want multipart, do not use the upload_file method; use the put_object method instead, which does not use multipart:

client = boto3.client('s3')

client.put_object(Body=open('/test.csv', 'rb'), Bucket='mybucket', Key='test.csv')

I found a workaround: increase the threshold size using S3Transfer and TransferConfig, as follows:

from boto3.s3.transfer import S3Transfer, TransferConfig

myconfig = TransferConfig(
    multipart_threshold=9999999999999999, # workaround to 'disable' automatic multipart upload
    max_concurrency=10,
    num_download_attempts=10,
)

connection = boto3.client(service_name='s3',
        region_name='',
        api_version=None,
        use_ssl=True,
        verify=True,
        endpoint_url=url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        aws_session_token=None,
        config=None)
transfer=S3Transfer(connection,myconfig)

transfer.upload_file('/tmp/samplefile.tgz','mybucket','remotefile.tgz')

I hope it helps someone.
