Right way to form a JSON file over iteration

I need to write a couple of URLs into a JSON file. This is what I have done so far:

for index, document in enumerate(master_data):

    # create a dictionary for each document in the master list
    document_dict = {}
    document_dict['cik_number'] = document[0]
    document_dict['company_name'] = document[1]
    document_dict['form_id'] = document[2]
    document_dict['date'] = document[3]
    document_dict['file_url'] = document[4]

    master_data[index] = document_dict

for document_dict in master_data:

    # if it's a 10-K document pull the url and the name.
    if document_dict['form_id'] == '10-K':

        # get the components
        data = {}
        data['company name'] = document_dict['company_name']
        data['cik_number'] = document_dict['cik_number']
        data['form_id'] = document_dict['form_id']
        data['date'] = document_dict['date']
        data['file_url'] = document_dict['file_url']

        write(data, JSON_file_to_write)
        JSON_file_to_write.write('\n')

And this is what I get at the end:

{"company name": "ZERO CORP", "cik_number": "109284", "form_id": "10-K", "date": "19940629", "file_url": "https://www.sec.gov/Archives/data/109284/0000898430-94-000468.txt"}
{"company name": "FOREST LABORATORIES INC", "cik_number": "109563", "form_id": "10-K", "date": "19940628", "file_url": "https://www.sec.gov/Archives/data/38074/0000038074-94-000021.txt"}
{"company name": "GOULDS PUMPS INC", "cik_number": "14637", "form_id": "10-K", "date": "19940331", "file_url": "https://www.sec.gov/Archives/data/42791/0000042791-94-000002.txt"}
{"company name": "GENERAL HOST CORP", "cik_number": "275605", "form_id": "10-Q", "date": "19940701", "file_url": "https://www.sec.gov/Archives/data/40638/0000950124-94-001209.txt"}

As I understand it, I have created a text file containing several separate JSON objects, one per line, rather than one JSON document. I need to create a single JSON file so that I can iterate over the file_url values and download the files into a folder.

I am really new to Python, JSON, and coding in general, and I'm trying to finish an assignment for my Capstone project. Please don't be harsh if I asked my question in the wrong format, and please show me the right way so I can learn and communicate better with everyone on this site.

You want to append each record to a list instead of writing it to the file immediately. Then you serialize the whole list of dicts as a single JSON array and write that to the file once.

import json

jsonList = []
for document_dict in master_data:

    # if it's a 10-K document pull the url and the name.
    if document_dict['form_id'] == '10-K':
        # get the components
        data = {}
        data['company name'] = document_dict['company_name']
        data['cik_number'] = document_dict['cik_number']
        data['form_id'] = document_dict['form_id']
        data['date'] = document_dict['date']
        data['file_url'] = document_dict['file_url']
        jsonList.append(data)

with open('filename.txt', 'w') as the_file:  # 'w' overwrites; appending with 'a' would put a second array in the file and make it invalid JSON
    the_file.write(json.dumps(jsonList))
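
Once the file holds a single JSON array, reading it back and looping over the URLs could look roughly like the sketch below. The two sample records are copied from the question's output. The file name filings.json, the temporary folder, and the helper names save_records and load_file_urls are assumptions made up for illustration, not part of the original code.

```python
import json
import os
import tempfile

# Sample records shaped like the filtered 10-K entries in the question.
records = [
    {
        "company name": "ZERO CORP",
        "cik_number": "109284",
        "form_id": "10-K",
        "date": "19940629",
        "file_url": "https://www.sec.gov/Archives/data/109284/0000898430-94-000468.txt",
    },
    {
        "company name": "FOREST LABORATORIES INC",
        "cik_number": "109563",
        "form_id": "10-K",
        "date": "19940628",
        "file_url": "https://www.sec.gov/Archives/data/38074/0000038074-94-000021.txt",
    },
]


def save_records(items, path):
    """Write the whole list as one JSON array (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(items, f, indent=2)


def load_file_urls(path):
    """Parse the file as one JSON document and collect each file_url (hypothetical helper)."""
    with open(path) as f:
        loaded = json.load(f)
    return [item["file_url"] for item in loaded]


# Assumed location: a fresh temporary folder, so nothing existing is overwritten.
json_path = os.path.join(tempfile.mkdtemp(), "filings.json")
save_records(records, json_path)

urls = load_file_urls(json_path)
for url in urls:
    # The download step would go here; it is left out of this sketch.
    print(url)
```

Because json.load parses the file as one document, this only works if the file contains exactly one array, which is why the list is written in a single call instead of one record at a time.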
