Reading contents from a gzip file stored in AWS S3
I am reading the contents of a gzip file stored in AWS S3 with Python.
I want to convert the contents into a pandas DataFrame.
If you are trying to load JSON data into a DataFrame, here is the code:
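import gzip
from io import StringIO

import boto3
import pandas as pd

# Fill in your own credentials here.
resource = boto3.resource('s3', aws_access_key_id='',
                          aws_secret_access_key='')
# Low-level client, used for listing the keys.
client = resource.meta.client

# Collect the keys under the prefix (list_objects returns at most 1000 keys per call).
response = client.list_objects(Bucket='bucket_name', Prefix='Folder name')
list_keys = [item['Key'] for item in response.get('Contents', [])]

lst = []
for key in list_keys:
    try:
        obj = resource.Object('bucket_name', key)
        # Decompress the object body and parse it as line-delimited JSON.
        with gzip.GzipFile(fileobj=obj.get()['Body']) as gzipfile:
            temp_data = pd.read_json(StringIO(gzipfile.read().decode('UTF-8')), lines=True)
        lst.append(temp_data)
    except Exception as e:
        # Report the failure instead of silently skipping the key.
        print(f'Skipping {key}: {e}')

# Combine the per-file frames into a single DataFrame.
df = pd.concat(lst, ignore_index=True)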
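
If you want to check the decompress-and-parse step without touching S3, here is a minimal sketch that uses only the standard library. The helper name `parse_gzip_json_lines` and the sample payload are made up for illustration, and the in-memory `BytesIO` object stands in for `obj.get()["Body"]`; it assumes each line of the file is one JSON object.

```python
import gzip
import io
import json


def parse_gzip_json_lines(fileobj):
    """Return a list of dicts parsed from a gzip-compressed JSON Lines stream.

    Hypothetical helper: `fileobj` is any binary file-like object with read().
    """
    records = []
    # Mode "rt" decompresses and decodes to text, so we can iterate line by line.
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as text:
        for line in text:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


# Small in-memory gzip payload that stands in for the S3 object body.
sample = gzip.compress(b'{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
records = parse_gzip_json_lines(io.BytesIO(sample))
print(records)
```

The resulting list of dicts can be passed to `pd.DataFrame(records)` to get one frame per file, which may be easier to debug than reading the whole decompressed string at once.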