
Can't read .gz file from AWS S3 using AWS Lambda

I'm trying to read a .gz file from S3 using AWS Lambda.

When I try to decompress the file using gzip, I get the error OSError: Not a gzipped file (b'PK').

Here is the code:

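from gzip import GzipFile
from io import BytesIO

# s3_client is a boto3 S3 client; bucket and key identify the object to read
retr = s3_client.get_object(Bucket=bucket, Key=key)
print(retr)
bytestream = BytesIO(retr['Body'].read())
print(bytestream)
got_text = GzipFile(mode='rb', fileobj=bytestream).read().decode('utf-8')
print(got_text)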

Here is the printed value of retr:

{  
   'ResponseMetadata':{  
      'RequestId':'aklsfjdlskfj',
      'HostId':'kajsdfhdkjfh/+r+i0OLx/adlksfjd/aksfh=',
      'HTTPStatusCode':200,
      'HTTPHeaders':{  
         'x-amz-id-2':'alsdkfjslkfjsalfjflkj/+r+i0OLx/asklfjslk/r9eTM=',
         'x-amz-request-id':'hgfhgf',
         'date':'Sun, 21 Jan 2018 08:35:28 GMT',
         'last-modified':'Sun, 21 Jan 2018 08:32:34 GMT',
         'etag':'"aksjdfhdskjfhfkjhf"',
         'accept-ranges':'bytes',
         'content-type':'application/x-gzip',
         'content-length':'2825',
         'server':'AmazonS3'
      },
      'RetryAttempts':0
   },
   'AcceptRanges':'bytes',
   'LastModified':datetime.datetime(2018, 1, 21, 8, 32, 34, tzinfo=tzutc()),
   'ContentLength':2825,
   'ETag':'"akjsfhksdjfhsdkfj"',
   'ContentType':'application/x-gzip',
   'Metadata':{  

   },
   'Body':<botocore.response.StreamingBody object at adsfdsf08>
}

How can I decompress that .gz file? I want to decompress it and read the .txt file inside it. Can anyone guide me?

The file you are trying to decompress is not a gzip file. It's a ZIP file. The b'PK' in the error message is the two-byte signature that ZIP files start with.

Here's what happens when I try to use the Python gzip module to decompress a ZIP file:

Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> with gzip.open("ResultList_example2.zip", "rb") as f: data = f.read()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\gzip.py", line 276, in read
    return self._buffer.read(size)
  File "C:\Python36\lib\gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "C:\Python36\lib\gzip.py", line 411, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'PK')

You will need to use the zipfile module instead.
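As a minimal sketch of that suggestion, the snippet below reads the first file out of a ZIP archive held in memory, which is the shape of data that retr['Body'].read() returns. The helper name read_first_text_member and the sample archive are hypothetical, made up for illustration, and UTF-8 text content is an assumption.

```python
import io
import zipfile


def read_first_text_member(zip_bytes, encoding="utf-8"):
    """Return (name, text) for the first regular file in an in-memory ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if not info.is_dir():
                return info.filename, archive.read(info).decode(encoding)
    raise ValueError("archive contains no files")


# Build a small archive in memory to stand in for retr['Body'].read().
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.txt", "hello from a zip")

name, text = read_first_text_member(buffer.getvalue())
print(name, text)
```

In the Lambda handler, the bytes passed in would be retr['Body'].read() rather than the sample buffer.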

I was uploading a corrupted .gz file. The code is fine and is working well now.

First, check whether the file is actually .gz or .zip. If it is .zip, use zipfile instead of gzip.
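One way to make that check, rather than trusting the file extension or the content-type header, is to look at the first two bytes of the object. The following is a rough sketch, not a definitive implementation: decompress_to_text is a hypothetical helper name, and it assumes UTF-8 text and, for ZIP input, an archive with at least one member.

```python
import gzip
import io
import zipfile


def decompress_to_text(data, encoding="utf-8"):
    """Decode gzip or ZIP bytes to text, choosing by the leading magic bytes."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(data).decode(encoding)
    if data[:2] == b"PK":  # ZIP files start with 'PK'
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # Assumption: the archive holds at least one member.
            first = archive.namelist()[0]
            return archive.read(first).decode(encoding)
    raise ValueError("unrecognized format: %r" % data[:2])


# Sample inputs standing in for the bytes read from S3.
gz_bytes = gzip.compress(b"from gzip")
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("a.txt", "from zip")
zip_bytes = zip_buffer.getvalue()

print(decompress_to_text(gz_bytes))
print(decompress_to_text(zip_bytes))
```

Anything that is neither format raises ValueError, which surfaces a mislabeled or corrupted upload early.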

import zipfile
Fileobj=zipfile.ZipFile(....)

This worked for me on AWS Lambda.
