
Load a JSON file containing multiple JSON objects from a GCS bucket in Python and process them one at a time

I have one JSON file, fruit.json, in a GCS bucket with the following content:

{
    "fruit": "Mango",
    "size": "Large",
    "color": "Yellow"
}
{
    "fruit": "Apple",
    "size": "medium",
    "color": "Red"
}
{
    "fruit": "Grapes",
    "size": "small",
    "color": "Green"
}

In Python I want to process each JSON object from that file, one at a time. How can I achieve that? I have tried the code below, but it doesn't seem to be working:

from google.cloud import storage
import json

storage_client = storage.Client()
bucket = storage_client.get_bucket('my-buckket_main1')
blob = bucket.blob('fruit.json')

objectList = []
# This raises json.JSONDecodeError ("Extra data") because the file
# contains several concatenated JSON documents, not a single one:
fruitjson = json.loads(blob.download_as_string(client=None))
print(fruitjson)
print("Started reading JSON file which contains multiple JSON documents")
# with open(blob) as f:
for jsonObj in fruitjson:
    objectDict = json.loads(jsonObj)
    objectList.append(objectDict)

print("Printing each decoded JSON object")
for fruits in objectList:
    print(fruits["fruit"], fruits["size"], fruits["color"])

How can I process each JSON document in the file, one at a time, and publish each as an event to Pub/Sub?
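Since the file holds several concatenated, pretty-printed JSON documents (so neither `json.loads` on the whole file nor per-line parsing works), one way to split them is `json.JSONDecoder.raw_decode`, which parses one document and reports where it ended. A minimal stdlib-only sketch; the function name is illustrative, and with google-cloud-storage the text would come from something like `blob.download_as_text()`:

```python
import json

def iter_json_objects(text):
    """Yield each document from a string of concatenated JSON objects,
    even when they are pretty-printed across several lines."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace separating the documents.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode parses one JSON value starting at pos and returns
        # the value plus the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

sample = """{
    "fruit": "Mango",
    "size": "Large",
    "color": "Yellow"
}
{
    "fruit": "Apple",
    "size": "medium",
    "color": "Red"
}"""

fruits = list(iter_json_objects(sample))
for f in fruits:
    print(f["fruit"], f["size"], f["color"])
# → Mango Large Yellow
# → Apple medium Red
```

This avoids downloading to a local file entirely, and it also copes with documents that span multiple lines.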

I found the solution to my question above; blob.download_to_filename(destination_uri) did the job for me:

destination_uri = '{}/{}'.format(folder, 'newSample.txt')
blob.download_to_filename(destination_uri)

fruitList = []

# Note: calling json.loads on each line only works if every JSON
# document occupies a single line of the downloaded file
# (newline-delimited JSON).
with open(destination_uri) as f:
    for jsonObject in f:
        fruitDict = json.loads(jsonObject)
        fruitList.append(fruitDict)
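To then publish each decoded document to Pub/Sub, here is a minimal sketch. It assumes the google-cloud-pubsub library; the function name, project, and topic are placeholders, and the publisher is passed in as a parameter so it can be stubbed for local testing:

```python
import json

def publish_documents(publisher, topic_path, documents):
    """Publish each JSON document as its own Pub/Sub message.

    `publisher` is anything exposing publish(topic, data=bytes) -- the
    real google.cloud.pubsub_v1.PublisherClient works, and so does a
    stub in tests. Returns the per-message publish futures.
    """
    futures = []
    for doc in documents:
        # Pub/Sub message payloads are bytes, so serialize each dict.
        payload = json.dumps(doc).encode("utf-8")
        futures.append(publisher.publish(topic_path, data=payload))
    return futures

# Real usage (assumes credentials are configured; 'my-project' and
# 'fruit-events' are placeholder names):
#   from google.cloud import pubsub_v1
#   publisher = pubsub_v1.PublisherClient()
#   topic_path = publisher.topic_path('my-project', 'fruit-events')
#   publish_documents(publisher, topic_path, fruitList)
```

Publishing one message per decoded object keeps each fruit as an independent event, which matches the "one at a time" processing asked about above.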
