
Convert JSON data to CSV with specific format using Python

I'm new to Python and I'm trying to convert a JSON API response to CSV. Below is a sample of the JSON structure returned by the API.

{
  'database': 'test_db',
  'results': [
    {
      'information': {
        'ID': 0,
        'owners': [
          'Me'
        ]
      },
      'id': '2021072000001',
      'metadata': {
        'Structure': [
          {
            'id': 'S2021072000001',
            'name': 'Col_1',
            'type': 'Column'
          },
          {
            'id': 'S2021072000002',
            'name': 'Col_2',
            'type': 'Column'
          },
          {
            'id': 'S2021072000003',
            'name': 'Key_1',
            'type': 'Key'
          }
        ]
      },
      'name': 'toto',
      'type': 'Table'
    }
  ],
  'results_sum': 1
}

I want to convert it to a CSV file with the following layout:

information.owners  | id            | name      | type      | metadata.structure.id | metadata.structure.name   | metadata.structure.type
---------------------------------------------------------------------------------------------------------------------------------------------- 
Me                  | 2021072000001 | toto      | Table     | S2021072000001        | Col_1                     | Column
Me                  | 2021072000001 | toto      | Table     | S2021072000002        | Col_2                     | Column
Me                  | 2021072000001 | toto      | Table     | S2021072000003        | Key_1                     | Key

Here is my program:

import csv

# res holds the parsed API response shown above
f = open('C:/Documents/results.csv', 'w', newline='', encoding='utf-8-sig')
fCSV = csv.writer(f, delimiter='|')
fCSV.writerow(['information.owners','id','name','type','metadata.structure.id','metadata.structure.name','metadata.structure.type'])

for item in res['results']:
    objectOwners = item['information']['owners']
    objectId = item['id']
    objectName = item['name']
    objectType = item['type']
    fCSV.writerow([objectOwners,objectId,objectName,objectType])

This program works, but when I add lines to capture the structure fields

  • structureId
  • structureName
  • structureType

it no longer works.

Thank you for your help

You'll need to walk each nested structure item while writing those rows:

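import csv
import sys

# Writes to stdout here; pass an open file object to write to disk instead
writer = csv.writer(sys.stdout)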
for item in res["results"]:
    objectOwners = ";".join(item["information"]["owners"])
    objectId = item["id"]
    objectName = item["name"]
    objectType = item["type"]
    for structure_item in item["metadata"]["Structure"]:
        sid = structure_item["id"]
        sname = structure_item["name"]
        stype = structure_item["type"]
        writer.writerow([objectOwners, objectId, objectName, objectType, sid, sname, stype])

Output (comma-delimited; pass delimiter='|' to csv.writer to match the layout you asked for):

Me,2021072000001,toto,Table,S2021072000001,Col_1,Column
Me,2021072000001,toto,Table,S2021072000002,Col_2,Column
Me,2021072000001,toto,Table,S2021072000003,Key_1,Key
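
For completeness, the loop above can be combined with the header row and the '|' delimiter from the question. This is only a sketch: the sample data is copied from the question, the names HEADER, flatten_rows and csv_text are made up for illustration, and an in-memory buffer stands in for the real output file so the snippet runs anywhere.

```python
import csv
import io

# Sample response copied from the question
res = {
    "database": "test_db",
    "results": [
        {
            "information": {"ID": 0, "owners": ["Me"]},
            "id": "2021072000001",
            "metadata": {
                "Structure": [
                    {"id": "S2021072000001", "name": "Col_1", "type": "Column"},
                    {"id": "S2021072000002", "name": "Col_2", "type": "Column"},
                    {"id": "S2021072000003", "name": "Key_1", "type": "Key"},
                ]
            },
            "name": "toto",
            "type": "Table",
        }
    ],
    "results_sum": 1,
}

HEADER = [
    "information.owners", "id", "name", "type",
    "metadata.structure.id", "metadata.structure.name", "metadata.structure.type",
]


def flatten_rows(response):
    """Yield one CSV row per entry in metadata.Structure (hypothetical helper)."""
    for item in response["results"]:
        # Several owners are joined into one cell, as in the loop above
        owners = ";".join(item["information"]["owners"])
        parent = [owners, item["id"], item["name"], item["type"]]
        for structure_item in item["metadata"]["Structure"]:
            yield parent + [
                structure_item["id"],
                structure_item["name"],
                structure_item["type"],
            ]


# Assumption: an in-memory buffer replaces
# open(path, "w", newline="", encoding="utf-8-sig") from the question
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|")
writer.writerow(HEADER)
writer.writerows(flatten_rows(res))

csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with the open(...) call from the question, ideally inside a with block so the file is closed when the loop finishes.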

pandas' json_normalize() can help here. This case is a little complicated because the owners and the structure entries sit in two separate nested lists, so I call json_normalize twice and then merge the two data frames on id.

import pandas as pd

# j is the parsed API response (a dict like the sample in the question)
df = pd.json_normalize(
        j["results"],
        meta=["name", "type", "id"],
        record_path=["metadata", "Structure"],
        record_prefix="metadata.structure.",
    ).merge(
        pd.json_normalize(
            j["results"],
            meta=["id"],
            record_path=["information", "owners"],
        ).rename(columns=({0: "information.owners"})),
        on="id",
    )

Output (the column order differs slightly from your example, but the data is the same):

  metadata.structure.id metadata.structure.name metadata.structure.type  name  \
0        S2021072000001                   Col_1                  Column  toto   
1        S2021072000002                   Col_2                  Column  toto   
2        S2021072000003                   Key_1                     Key  toto   

    type             id information.owners  
0  Table  2021072000001                 Me  
1  Table  2021072000001                 Me  
2  Table  2021072000001                 Me  
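
To get from that data frame to the pipe-delimited layout in the question, one option is to reorder the columns and call to_csv. The following is a sketch rather than a definitive recipe: the sample data is copied from the question, and the names structure, owners, columns and csv_text are chosen here for illustration.

```python
import pandas as pd

# Sample response copied from the question
j = {
    "database": "test_db",
    "results": [
        {
            "information": {"ID": 0, "owners": ["Me"]},
            "id": "2021072000001",
            "metadata": {
                "Structure": [
                    {"id": "S2021072000001", "name": "Col_1", "type": "Column"},
                    {"id": "S2021072000002", "name": "Col_2", "type": "Column"},
                    {"id": "S2021072000003", "name": "Key_1", "type": "Key"},
                ]
            },
            "name": "toto",
            "type": "Table",
        }
    ],
    "results_sum": 1,
}

# One row per Structure entry, carrying the parent's name, type and id
structure = pd.json_normalize(
    j["results"],
    record_path=["metadata", "Structure"],
    meta=["name", "type", "id"],
    record_prefix="metadata.structure.",
)

# One row per owner; the owner strings land in a column labelled 0
owners = pd.json_normalize(
    j["results"],
    record_path=["information", "owners"],
    meta=["id"],
).rename(columns={0: "information.owners"})

df = structure.merge(owners, on="id")

# Column order requested in the question
columns = [
    "information.owners", "id", "name", "type",
    "metadata.structure.id", "metadata.structure.name", "metadata.structure.type",
]

# With no path argument, to_csv returns the CSV text as a string
csv_text = df[columns].to_csv(sep="|", index=False)
print(csv_text)
```

Passing a file path as the first argument to to_csv writes the file directly. Note that with several owners per result, the merge yields one row per owner and structure pair, unlike the ";".join approach in the other answer, which keeps all owners in a single cell.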

