I am trying to read a file from S3, which has the following content stored in it:
{"empID":{"n":"7"},"name":{"s":"NewEntry"}}
{"empID":{"n":"3"},"name":{"s":"manish"}}
{"empID":{"n":"2"},"name":{"s":"mandeep"}}
{"empID":{"n":"4"},"name":{"s":"Vikas"}}
{"empID":{"n":"1"},"name":{"s":"babbar"}}
I want to iterate over every object and do some processing on each one.
I am using this code as a reference:
import json
import boto3
s3_obj = boto3.client('s3')
s3_clientobj = s3_obj.get_object(Bucket='dane-fetterman-bucket', Key='mydata.json')
s3_clientdata = s3_clientobj['Body'].read().decode('utf-8')
print("printing s3_clientdata")
print(s3_clientdata)
print(type(s3_clientdata))
s3clientlist = json.loads(s3_clientdata)
print("json loaded data")
print(s3clientlist)
print(type(s3clientlist))
but there is no "Body" attribute in the file. Can I get some pointers on how to do this?
The issue is that the file contains a separate JSON object on each line, rather than being a single valid JSON document, so json.loads() cannot parse the whole content in one call.
Therefore, the program needs to parse each line independently:
import json
import boto3
s3_client = boto3.client('s3')
s3_clientobj = s3_client.get_object(Bucket='my-bucket', Key='mydata.json')
for line in s3_clientobj['Body'].iter_lines():
    # Each line is a standalone JSON document
    record = json.loads(line)
    print(f"ID: {record['empID']['n']} Name: {record['name']['s']}")
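If the file might contain blank lines (for example a trailing newline), the per-line parsing can be wrapped in a small generator that skips them. This is only a sketch: parse_json_lines is a hypothetical helper name, and the sample list stands in for the lines that iter_lines() would yield from S3.

```python
import json

def parse_json_lines(lines):
    """Yield one parsed object per non-empty line (accepts str or bytes)."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines, e.g. a trailing newline
        yield json.loads(raw)

# Stand-in data in the same shape as the file in the question
sample = [
    b'{"empID":{"n":"7"},"name":{"s":"NewEntry"}}',
    b'',
    b'{"empID":{"n":"3"},"name":{"s":"manish"}}',
]
records = list(parse_json_lines(sample))
for record in records:
    print(f"ID: {record['empID']['n']} Name: {record['name']['s']}")
```

With the real object, the same helper should accept s3_clientobj['Body'].iter_lines() in place of sample.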
Alternatively, you could download the whole object to disk and then use the normal for line in open('file'): syntax.
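A rough sketch of the download-to-disk approach follows. The function names, bucket, key, and local path are all placeholders chosen for illustration; download_file is the boto3 S3 client method that saves an object to a local file. boto3 is imported inside the download helper so that the parsing function can be tried without it.

```python
import json

def read_json_lines_file(path):
    """Parse a local file that holds one JSON object per line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # ignore blank lines
                records.append(json.loads(line))
    return records

def download_and_read(bucket, key, local_path):
    """Hypothetical helper: download the S3 object, then parse it locally."""
    import boto3  # imported here so read_json_lines_file works without boto3
    boto3.client("s3").download_file(bucket, key, local_path)
    return read_json_lines_file(local_path)
```

For example, download_and_read('my-bucket', 'mydata.json', '/tmp/mydata.json') would return a list of dictionaries, assuming the credentials in use can read that object.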