
Python, AWS S3: how to read file with jsons

I have a file of JSONs in an S3 bucket (one JSON object per line). I'm having trouble reading them correctly. Here is what I'm doing:

s3 = boto3.client('s3')
response = s3.get_object(Bucket=SOURCE_BUCKET, Key=key)
file = response['Body']
for line in file:
    data_json = json.loads(line, encoding='utf-8')

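In this case it ignores \n and reads a whole chunk of text as a single line.

How do I correctly read every JSON object, one per line, from the file?

Example input file contents (a file containing several JSON objects, each on its own line):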

{"notificationItems":[{"NotificationRequestItem":{"eventCode":"PENDING","AccountCode":"A001US","amount":{"currency":"USD","value":111},"success":"true","method":"xxx","reference":"43535353","date":"2021"}}],"go":"true"}
{"notificationItems":[{"NotificationRequestItem":{"eventCode":"PENDING","AccountCode":"A002US","amount":{"currency":"USD","value":111},"success":"true","method":"xxx","reference":"43535353","date":"2021"}}],"go":"true"}
...
{"notificationItems":[{"NotificationRequestItem":{"eventCode":"PENDING","AccountCode":"A003US","amount":{"currency":"USD","value":111},"success":"true","method":"xxx","reference":"43535353","date":"2021"}}],"go":"true"}

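boto3's get_object returns a StreamingBody object as the value of the Body key in the returned dictionary.

One of that object's methods is iter_lines, which lets you iterate over the lines of the response as it is read. From there you can call json.loads on each line: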

for line in file.iter_lines():
    data = json.loads(line)
    print(data)
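
As a slightly fuller sketch of the same idea, the loop can be wrapped in a generator that also skips blank lines (for example, a trailing newline at the end of the file). The function names iter_json_lines and read_json_lines_from_s3 are hypothetical, chosen for illustration; only get_object, the Body key, and iter_lines come from boto3/botocore.

```python
import json
from typing import Any, Iterable, Iterator


def iter_json_lines(lines: Iterable[bytes]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line."""
    for line in lines:
        if line.strip():  # skip blank lines, e.g. from a trailing newline
            yield json.loads(line)  # json.loads accepts bytes directly


def read_json_lines_from_s3(bucket: str, key: str) -> Iterator[Any]:
    """Stream an S3 object line by line and parse each line as JSON."""
    # Imported here so iter_json_lines can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"]
    yield from iter_json_lines(body.iter_lines())
```

Calling read_json_lines_from_s3(SOURCE_BUCKET, key) would then yield one dictionary per line of the file without loading the whole object into memory at once.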

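get_object returns a botocore.response.StreamingBody. If your function cannot work with the raw byte stream, you need to call .read() on it (see the documentation). That gives you the whole object as bytes, which you then split into lines yourself: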

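# The key is 'Body' (capital B); .read() returns the whole object as bytes
response = s3.get_object(Bucket=SOURCE, Key=key)['Body'].read()
# Iterating bytes directly yields integers, so split into lines first
for line in response.splitlines():
    json_data = json.loads(line)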
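
To try the .read() approach without touching S3, here is a minimal sketch that parses an in-memory bytes payload the same way. The helper name parse_json_lines and the sample payload are made up for illustration; in real use the payload would be the result of ['Body'].read().

```python
import json
from typing import Any, List


def parse_json_lines(payload: bytes) -> List[Any]:
    """Parse a bytes payload holding one JSON value per line."""
    records = []
    # bytes.splitlines() splits on \n, \r and \r\n and drops the terminators
    for line in payload.splitlines():
        if line.strip():  # ignore empty lines
            records.append(json.loads(line))
    return records


# Stand-in for s3.get_object(...)['Body'].read()
sample = b'{"go": "true"}\n{"go": "false"}\n'
records = parse_json_lines(sample)
print(len(records))
```

Note that .read() pulls the entire object into memory, so for large files iter_lines is likely the better fit.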

