Can't read .gz file from AWS S3 using AWS Lambda
I'm trying to read a .gz file from S3 using AWS Lambda.
When I try to decompress the file using gzip, I get the error OSError: Not a gzipped file (b'PK').
Here is the code:
retr = s3_client.get_object(Bucket=bucket, Key=key)
print(retr)
bytestream = BytesIO(retr['Body'].read())
print(bytestream)
got_text = GzipFile(mode='rb', fileobj=bytestream).read().decode('utf-8')
print(got_text)
Here is the value of retr, the response returned by get_object:
{
'ResponseMetadata':{
'RequestId':'aklsfjdlskfj',
'HostId':'kajsdfhdkjfh/+r+i0OLx/adlksfjd/aksfh=',
'HTTPStatusCode':200,
'HTTPHeaders':{
'x-amz-id-2':'alsdkfjslkfjsalfjflkj/+r+i0OLx/asklfjslk/r9eTM=',
'x-amz-request-id':'hgfhgf',
'date':'Sun, 21 Jan 2018 08:35:28 GMT',
'last-modified':'Sun, 21 Jan 2018 08:32:34 GMT',
'etag':'"aksjdfhdskjfhfkjhf"',
'accept-ranges':'bytes',
'content-type':'application/x-gzip',
'content-length':'2825',
'server':'AmazonS3'
},
'RetryAttempts':0
},
'AcceptRanges':'bytes',
'LastModified':datetime.datetime(2018, 1, 21, 8, 32, 34, tzinfo=tzutc()),
'ContentLength':2825,
'ETag':'"akjsfhksdjfhsdkfj"',
'ContentType':'application/x-gzip',
'Metadata':{
},
'Body':<botocore.response.StreamingBody object at adsfdsf08>
}
How can I decompress that .gz file? I want to decompress it and read the .txt file inside it. Can anyone guide me?
The file you are trying to decompress is not a gzip file. It's a ZIP file: the b'PK' in the error message is the signature that ZIP archives begin with.
Here's what happens when I try to use the Python gzip module to decompress a ZIP file:
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> with gzip.open("ResultList_example2.zip", "rb") as f: data = f.read()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\gzip.py", line 276, in read
    return self._buffer.read(size)
  File "C:\Python36\lib\gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "C:\Python36\lib\gzip.py", line 411, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'PK')
You will need to use the zipfile module instead.
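As a minimal sketch of that suggestion, assuming the object's bytes are already in memory (for example from retr['Body'].read() as in the question), something like the following might work. The helper name read_zip_text and the member name example.txt are hypothetical; only the standard-library io and zipfile modules are used, and the ZIP is built in memory here purely to stand in for the S3 object.

```python
import io
import zipfile


def read_zip_text(raw_bytes, encoding="utf-8"):
    """Return {member name: decoded text} for every file in a ZIP archive."""
    texts = {}
    # ZipFile needs a seekable file object, so wrap the bytes in BytesIO.
    with zipfile.ZipFile(io.BytesIO(raw_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            texts[name] = archive.read(name).decode(encoding)
    return texts


# Build a small ZIP in memory as a stand-in for the S3 object body.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.txt", "hello from a zip")

result = read_zip_text(buffer.getvalue())
print(result)
```

In the Lambda handler, the bytes passed to read_zip_text would presumably be retr['Body'].read() instead of the in-memory buffer.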
I was uploading a corrupted .gz file. The code is fine and is working well now.
First, check whether the file is really a .gz or a .zip. If it is a .zip, use zipfile instead of gzip.
import zipfile
Fileobj=zipfile.ZipFile(....)
This worked for me on AWS Lambda.
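One way to do the "check first" step is to look at the first two bytes of the object rather than trusting the file extension: gzip streams start with 0x1f 0x8b, while ZIP archives start with PK. The sketch below is only an illustration under some assumptions: the helper name decompress_to_text is hypothetical, the bytes are already in memory, and for a ZIP it assumes the archive has at least one member and that the first member is the text file you want.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK"  # the bytes reported in the OSError from the question


def decompress_to_text(raw_bytes, encoding="utf-8"):
    """Decode gzip or ZIP bytes to text, chosen by the leading magic bytes."""
    if raw_bytes[:2] == GZIP_MAGIC:
        return gzip.decompress(raw_bytes).decode(encoding)
    if raw_bytes[:2] == ZIP_MAGIC:
        with zipfile.ZipFile(io.BytesIO(raw_bytes)) as archive:
            # Assumption: the first member is the text file of interest.
            first_member = archive.namelist()[0]
            return archive.read(first_member).decode(encoding)
    raise ValueError("neither gzip nor ZIP: %r" % raw_bytes[:2])


# In-memory stand-ins for the two kinds of S3 object.
gz_bytes = gzip.compress("plain gzip".encode("utf-8"))

zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("inner.txt", "zip in disguise")
zip_bytes = zip_buffer.getvalue()

gz_text = decompress_to_text(gz_bytes)
zip_text = decompress_to_text(zip_bytes)
```

With this approach a ZIP file that was uploaded with a .gz name is still read correctly, and anything that is neither format fails with an explicit error.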