python - gzip string and upload to s3

I have written a snippet that downloads a file from S3, modifies some XML data, and uploads it back to S3. The data is gzipped, so I decompress it first, modify it, and then gzip it again. I can see that gzip returns some data (definitely not length 0), so why does the upload report a content length of 0?

    import io
    import sys
    from gzip import GzipFile

    import boto3
    from lxml import etree as et  # .xpath() below is an lxml method

    s3 = boto3.client('s3')
    s3Key = 'test'
    # bucketName and nodeName are defined elsewhere
    try:
        # download the gzipped object into memory
        bytes_buffer = io.BytesIO()
        s3.download_fileobj(Bucket=bucketName, Key=s3Key, Fileobj=bytes_buffer)
        byte_value = io.BytesIO(bytes_buffer.getvalue())
        gzipfile = GzipFile(fileobj=byte_value)
        content = gzipfile.read()
        xml = et.fromstring(content)
        for specialrequest in xml.xpath("(//*[local-name()='{}'])".format(nodeName)):
            # perform regex
            value = specialrequest.text
            value = 'test_replacement'
            specialrequest.text = value
        xml = et.tostring(xml)
        # gzip the modified XML into a fresh in-memory buffer
        byte_value = io.BytesIO()
        with GzipFile(fileobj=byte_value, mode="w") as f:
            f.write(xml)
        #s3.upload_fileobj(io.BytesIO(byte_value), bucketName, s3Key)
        response = s3.put_object(Body=byte_value.getvalue(), Bucket=bucketName, Key=s3Key)
        print(response)
        #print(byte_value.getvalue())
    except Exception:
        print("Unexpected error:", sys.exc_info()[0])

The put succeeds, but the content length in the response is always 0:

{u'VersionId': 'mHZJAS6b2ordFx802D4egd56VFZjACOI', u'ETag': '"5d8fa27c1e14fee5d12c6856cc0c2074"', 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': 'Ig2nK1VtgURwGIHXXF8cgYqoUPrY/jW3ilhI8so9E9T0AKUn5Q3FX0IfrDsHanxqXS/4kO9Dje4=', 'RequestId': '1PY7DFWE37CACEM9', 'HTTPHeaders': {'content-length': '0', 'x-amz-id-2': 'Ig2nK1VtgURwGIHXXF8cgYqoUPrY/jW3ilhI8so9E9T0AKUn5Q3FX0IfrDsHanxqXS/4kO9Dje4=', 'server': 'AmazonS3', 'x-amz-request-id': '1PY7DFWE37CACEM9', 'etag': '"5d8fa27c1e14fee5d12c6856cc0c2074"', 'date': 'Tue, 22 Jun 2021 02:34:48 GMT', 'x-amz-version-id': 'mHZJAS6b2ordFx802D4egd56VFZjACOI'}}}
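
One thing worth checking here: the `content-length: 0` value sits under `ResponseMetadata['HTTPHeaders']`, which holds the headers of S3's HTTP response to the PUT request, and that response has no body. It may therefore say nothing about the size of the stored object. The sketch below compares the ETag from the response against the MD5 of an empty body, assuming a plain single-part upload without SSE-KMS (in that case the ETag is the MD5 of the uploaded bytes). `stored_size` is a hypothetical helper, not part of the original code, that reads the real object size with `head_object`.

```python
import hashlib

# ETag copied from the put_object response shown above
etag = "5d8fa27c1e14fee5d12c6856cc0c2074"

# Assumption: single-part upload without SSE-KMS, so ETag == MD5 of the body
empty_md5 = hashlib.md5(b"").hexdigest()
etag_is_empty_body = (etag == empty_md5)
print(etag_is_empty_body)  # False: the uploaded body was not empty


def stored_size(s3_client, bucket, key):
    """Hypothetical helper: return the stored object's size in bytes."""
    return s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]
```

With a real client this could be called as `stored_size(boto3.client('s3'), bucketName, s3Key)` right after the upload.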

EDIT:

After switching to zlib for compression, I was able to upload the file with the expected size (the same as the downloaded gzip). However, when I try to unzip it locally to validate the data, it keeps getting turned into a .cpgz file for some reason:

xml = et.tostring(xml)
compressed = zlib.compress(str.encode(xml))
response = s3.put_object(Body=compressed, Bucket=bucketName, Key=s3Key)
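
A likely reason the local unzip fails is that `zlib.compress` emits a zlib-wrapped stream rather than a gzip file, so the result lacks the gzip magic bytes `1f 8b` that gzip tools look for. The following sketch, using a made-up payload, shows the difference and one way to get a gzip container out of zlib via `wbits`:

```python
import gzip
import zlib

# Made-up sample payload for illustration
payload = b"<root><item>test_replacement</item></root>"

zlib_bytes = zlib.compress(payload)   # zlib container
gzip_bytes = gzip.compress(payload)   # gzip container

GZIP_MAGIC = b"\x1f\x8b"
zlib_looks_like_gzip = zlib_bytes[:2] == GZIP_MAGIC
gzip_looks_like_gzip = gzip_bytes[:2] == GZIP_MAGIC

# Adding 16 to wbits asks zlib to write a gzip header and trailer instead
compressor = zlib.compressobj(wbits=zlib.MAX_WBITS | 16)
gzip_via_zlib = compressor.compress(payload) + compressor.flush()

# Both gzip-format results decompress back to the original bytes
assert gzip.decompress(gzip_bytes) == payload
assert gzip.decompress(gzip_via_zlib) == payload
```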

Try this, assuming that `root` is the root element of your modified XML:

import gzip
import xml.etree.ElementTree as ET

import boto3

# root is the modified XML root element; bucketName and s3Key are as in the question
xml_string = ET.tostring(root, encoding='utf-8').decode('utf8')
print(xml_string)  # Optional

xml_byte = bytes(xml_string, 'utf8')  # gzip.compress takes bytes, not str

gzip_compressed = gzip.compress(xml_byte)

s3 = boto3.client('s3')
response = s3.put_object(Body=gzip_compressed, Bucket=bucketName, Key=s3Key)

if response:
    print("file uploaded successfully")
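
To check the compressed bytes before uploading, the decompress/modify/compress steps can be exercised in memory without touching S3. This is a minimal sketch using only the standard library; `rewrite_gzipped_xml` and the sample element name `SpecialRequest` are hypothetical names chosen for illustration.

```python
import gzip
import xml.etree.ElementTree as ET


def rewrite_gzipped_xml(gz_bytes, node_name, new_text):
    """Gunzip XML bytes, replace the text of every element whose
    local name equals node_name, and return the gzipped result."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    for elem in root.iter():
        # Drop a leading '{namespace}' prefix, if any, to get the local name
        local_name = elem.tag.rsplit('}', 1)[-1]
        if local_name == node_name:
            elem.text = new_text
    return gzip.compress(ET.tostring(root, encoding='utf-8'))


# Made-up sample document standing in for the object downloaded from S3
original = gzip.compress(b"<root><SpecialRequest>old</SpecialRequest></root>")
updated = rewrite_gzipped_xml(original, "SpecialRequest", "test_replacement")

# Round trip: the bytes that would be passed as Body decompress to the new XML
restored = gzip.decompress(updated).decode("utf-8")
print(restored)
```

The `updated` bytes are what would be handed to `put_object` as `Body`.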
