简体   繁体   中英

Convert Nested JSON list API data into CSV using PYTHON

Want to convert Sample JSON data into CSV file using python. I am retrieving JSON data from API. As my JSON has nested objects, so it normally cannot be directly converted to CSV.I don't want to do any hard coding and I want to make a python code fully dynamic.

So, I have written a function that flatten my JSON Data but I am not able to work out how to iterate all records, finding relevant column names and then output those data into CSV.

In the Sample JSON file I have mentioned only 2 records but in actual there are 100 records. Sample JSON Look like this:

[
   {
      "id":"Random_Company_57",
      "unid":"75",
      "fieldsToValues":{
         "Email":"None",
         "occupation":"SO1 Change",
         "manager":"None",
         "First Name":"Bells",
         "employeeID":"21011.0",
         "loginRequired":"true",
         "superUser":"false",
         "ldapSuperUser":"false",
         "archived":"true",
         "password":"None",
         "externalUser":"false",
         "Username":"Random_Company_57",
         "affiliation":"",
         "Phone":"+16 22 22 222",
         "unidDominoKey":"",
         "externalUserActive":"false",
         "secondaryOccupation":"SO1 Change",
         "retypePassword":"None",
         "Last Name":"Christmas"
      },
      "hierarchyFieldAccess":[
         
      ],
      "userHierarchies":[
         {
            "hierarchyField":"Company",
            "value":"ABC Company"
         },
         {
            "hierarchyField":"Department",
            "value":"gfds"
         },
         {
            "hierarchyField":"Project",
            "value":"JKL-SDFGHJW"
         },
         {
            "hierarchyField":"Division",
            "value":"Silver RC"
         },
         {
            "hierarchyField":"Site",
            "value":"SQ06"
         }
      ],
      "locale":{
         "id":1,
         "dateFormat":"dd/MM/yyyy",
         "languageTag":"en-UA"
      },
      "roles":[
         "User"
      ],
      "readAccessRoles":[
         
      ],
      "preferredLanguage":"en-AU",
      "prefName":"Christmas Bells",
      "startDate":"None",
      "firstName":"Bells",
      "lastName":"Christmas",
      "fullName":"Christmas Bells",
      "lastModified":"2022-02-22T03:47:41.632Z",
      "email":"None",
      "docNo":"None",
      "virtualSuperUser":false
   },
   {
      "id":"xyz.abc@safe.net",
      "unid":"98",
      "fieldsToValues":{
         "Email":"xyz.abc@safe.net",
         "occupation":"SO1 Change",
         "manager":"None",
         "First Name":"Bells",
         "employeeID":"21011.0",
         "loginRequired":"false",
         "superUser":"false",
         "ldapSuperUser":"false",
         "archived":"false",
         "password":"None",
         "externalUser":"false",
         "Username":"xyz.abc@safe.net",
         "affiliation":"",
         "Phone":"+16 2222 222 222",
         "unidDominoKey":"",
         "externalUserActive":"false",
         "secondaryOccupation":"SO1 Change",
         "retypePassword":"None",
         "Last Name":"Christmas"
      },
      "hierarchyFieldAccess":[
         
      ],
      "userHierarchies":[
         {
            "hierarchyField":"Company",
            "value":"ABC Company"
         },
         {
            "hierarchyField":"Department",
            "value":"PUHJ"
         },
         {
            "hierarchyField":"Project",
            "value":"RPOJ-SDFGHJW"
         },
         {
            "hierarchyField":"Division",
            "value":"Silver RC"
         },
         {
            "hierarchyField":"Site",
            "value":"SQ06"
         }
      ],
      "locale":{
         "id":1,
         "dateFormat":"dd/MM/yyyy",
         "languageTag":"en-UA"
      },
      "roles":[
         "User"
      ],
      "readAccessRoles":[
         
      ],
      "preferredLanguage":"en-AU",
      "prefName":"Christmas Bells",
      "startDate":"None",
      "firstName":"Bells",
      "lastName":"Christmas",
      "fullName":"Christmas Bells",
      "lastModified":"2022-03-16T05:04:13.085Z",
      "email":"xyz.abc@safe.net",
      "docNo":"None",
      "virtualSuperUser":false
   }
]

What I have tried.

def flattenjson(b, delim):
    val = {}
    for i in b.keys():
        if isinstance(b[i], dict):
            get = flattenjson(b[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = b[i]
    print(val)        
    return val
    
json=[{Sample JSON String that mentioned above}]
flattenjson(json,"__")

I don't know it is a right way to deal this problem or not? My final aim is that all the above json data will output in a csv file.

Based on this answer, you could loop through your list of json data and flatten each json with the given function (they always have the same structure?), then build a DataFrame and write the data to csv. That's the easiest way I can think of, try this:

import pandas as pd
import json
import collections

def flatten(dictionary, parent_key=False, separator='__'):

    items = []
    for key, value in dictionary.items():
        new_key = str(parent_key) + separator + key if parent_key else key
        if isinstance(value, collections.MutableMapping):
            items.extend(flatten(value, new_key, separator).items())
        elif isinstance(value, list):
            for k, v in enumerate(value):
                items.extend(flatten({str(k): v}, new_key).items())
        else:
            items.append((new_key, value))
    return dict(items)
    
with open('your_json.json') as f:
    data = json.load(f) # data is a the example you provided (list of dicts)
    
all_records=[]
for jsn in data:
    tmp = flatten(jsn)
    all_records.append(tmp)
    
df = pd.DataFrame(all_records)
out = df.to_csv('json_to_csv.csv')

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM