
Extracting gzip file in an S3 bucket to another S3 Bucket

I'm trying to copy a gzip file from one S3 bucket and extract its contents to another S3 bucket using the gzip library. I'm getting this error:

Seek from end not supported

import boto3, json
from io import BytesIO
import gzip

def lambda_handler():
    try:
        s3 = boto3.resource('s3')
        copy_source = {
            'Bucket': 'srcbucket',
            'Key': 'samp.gz'
            }
        bucket = s3.Bucket('destbucket')
        bucketSrc = s3.Bucket('srcbucket')

        s3Client = boto3.client('s3', use_ssl=False)

        s3Client.upload_fileobj(                      # upload a new obj to s3
            Fileobj=gzip.GzipFile(              # read in the output of gzip -d
                None,                           # just return output as BytesIO
                'rb',                           # read binary
                fileobj=BytesIO(s3Client.get_object(Bucket='srcbucket', Key='samp.gz')['Body'].read())),
            Bucket='destbucket',                      # target bucket, writing to
            Key="")               # target key, writing to

    except Exception as e:
        print(e)

You can't unzip the ZIP file and upload its constituent files the way you're trying to.

You could unzip the entire ZIP file to Lambda local disk in /tmp (note this has a limit of 512 MB of disk space), then upload file by file. Or, if it will not fit on disk or you prefer not to persist to disk, you can stream the contents of the ZIP file into memory, file by file, and then upload each stream to S3. In both solutions, you will need to supply an appropriate key for each and every upload.
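The in-memory approach described above might look roughly like the following. This is a minimal sketch, not a definitive implementation: the function names `copy_gunzipped` and `copy_zip_members` and the `dest_key` and `prefix` parameters are hypothetical, the S3 client is passed in by the caller, and each source object is assumed to be small enough to hold entirely in Lambda memory. The decompressed data is wrapped in a `BytesIO`, which is fully seekable, before being handed to `upload_fileobj`.

```python
import gzip
import io
import zipfile


def copy_gunzipped(s3_client, src_bucket, src_key, dest_bucket, dest_key):
    """Decompress a single gzip object in memory and upload the result.

    dest_key must be a non-empty key chosen by the caller.
    """
    body = s3_client.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
    data = gzip.decompress(body)  # whole payload is held in memory
    s3_client.upload_fileobj(
        Fileobj=io.BytesIO(data), Bucket=dest_bucket, Key=dest_key
    )
    return dest_key


def copy_zip_members(s3_client, src_bucket, src_key, dest_bucket, prefix=""):
    """Upload each file inside a ZIP archive as its own object.

    Each member is uploaded under prefix + its name within the archive
    (a hypothetical naming rule; choose whatever keys suit your layout).
    """
    body = s3_client.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
    uploaded_keys = []
    with zipfile.ZipFile(io.BytesIO(body)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries have no content to upload
            key = prefix + info.filename
            with archive.open(info) as member:
                # Read one member at a time so only one file is decompressed
                # in memory at once (in addition to the archive itself).
                s3_client.upload_fileobj(
                    Fileobj=io.BytesIO(member.read()),
                    Bucket=dest_bucket,
                    Key=key,
                )
            uploaded_keys.append(key)
    return uploaded_keys
```

With a real client this could be called as `copy_gunzipped(boto3.client('s3'), 'srcbucket', 'samp.gz', 'destbucket', 'samp')`, where the destination key `'samp'` is only an illustrative choice. For objects too large for memory, the /tmp approach mentioned above is likely the safer option.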

