
How to convert nested JSON data to CSV using Python?

I have a file consisting of an array containing over 5000 objects. However, I am having trouble converting one particular part of my JSON file into the appropriate columns in CSV format.

Below is an example version of my data file:

{
  "Result": {
    "Example 1": {
      "Type1": [
        {
          "Owner": "Name1 Example",
          "Description": "Description1 Example",
          "Email": "example1_email@email.com",
          "Phone": "(123) 456-7890"
        }
      ]
    },
    "Example 2": {
      "Type1": [
        {
          "Owner": "Name2 Example",
          "Description": "Description2 Example",
          "Email": "example2_email@email.com",
          "Phone": "(111) 222-3333"
        }
      ]
    }
  }
}

Here is my current code:

import csv
import json

json_file='example.json'
with open(json_file, 'r') as json_data:
    x = json.load(json_data)

f = csv.writer(open("example.csv", "w"))

f.writerow(["Address","Type","Owner","Description","Email","Phone"])

for key in x["Result"]:
    type = "Type1"
    f.writerow([key,
                type,
                x["Result"][key]["Type1"]["Owner"],
                x["Result"][key]["Type1"]["Description"],
                x["Result"][key]["Type1"]["Email"],
                x["Result"][key]["Type1"]["Phone"]])

My problem is that I'm encountering this issue:

Traceback (most recent call last):
  File "./convert.py", line 18, in <module>
    x["Result"][key]["Type1"]["Owner"],
TypeError: list indices must be integers or slices, not str

When I replace the final string key, such as "Owner", with an integer index, I get a different error: IndexError: list index out of range.

When I simplify the f.writerow call to

f.writerow([key,
                type,
                x["Result"][key]["Type1"]])

I get output, but the whole list is merged into a single column, which makes sense. Picture of the output: https://imgur.com/a/JpDkaAT

I would like the results to be separated based on the label into individual columns instead of being merged into one. Could anyone assist?

Thank you!

Type1 in your data structure is a list, not a dict. So you need to iterate over it instead of referencing by key.

for key in x["Result"]:
    # key is now "Example 1" etc.
    type1 = x["Result"][key]["Type1"]
    # type1 is a list, not a dict
    for entry in type1:
        # entry is one dict from the list, so string keys work here
        f.writerow([key,
                    "Type1",
                    entry["Owner"],
                    entry["Description"],
                    entry["Email"],
                    entry["Phone"]])

The inner for loop means you no longer rely on the assumption that "Type1" only ever has one item in the list.
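
For anyone who wants to try the nested loop end to end, here is a minimal sketch under a few assumptions: the data is the sample from the question, every entry has all four keys, and `iter_rows` is a hypothetical helper name, not something from the original code. It writes to an in-memory buffer so it can run without touching the disk.

```python
import csv
import io

# Sample data copied from the question.
data = {
    "Result": {
        "Example 1": {"Type1": [{"Owner": "Name1 Example",
                                 "Description": "Description1 Example",
                                 "Email": "example1_email@email.com",
                                 "Phone": "(123) 456-7890"}]},
        "Example 2": {"Type1": [{"Owner": "Name2 Example",
                                 "Description": "Description2 Example",
                                 "Email": "example2_email@email.com",
                                 "Phone": "(111) 222-3333"}]},
    }
}

HEADER = ["Address", "Type", "Owner", "Description", "Email", "Phone"]


def iter_rows(data):
    """Yield one CSV row per entry in each "Type1" list (hypothetical helper)."""
    for address, types in data["Result"].items():
        for entry in types["Type1"]:  # the list may hold zero, one, or many dicts
            yield [address, "Type1", entry["Owner"], entry["Description"],
                   entry["Email"], entry["Phone"]]


rows = list(iter_rows(data))

# Written to an in-memory buffer here; swap in
# open("example.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerows(rows)
csv_text = buffer.getvalue()
```

If a "Type1" list ever holds several dicts, each one becomes its own row with the same Address value.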

It's definitely not the best example, but I'm too sleepy to optimize it.

import csv


def json_to_csv(obj, res):
    for k, v in obj.items():
        if isinstance(v, dict):
            res.append(k)
            json_to_csv(v, res)
        elif isinstance(v, list):
            res.append(k)
            for el in v:
                json_to_csv(el, res)
        else:
            res.append(v)


obj = {
  "Result": {
    "Example 1": {
      "Type1": [
        {
          "Owner": "Name1 Example",
          "Description": "Description1 Example",
          "Email": "example1_email@email.com",
          "Phone": "(123) 456-7890"
        }
      ]
    },
    "Example 2": {
      "Type1": [
        {
          "Owner": "Name2 Example",
          "Description": "Description2 Example",
          "Email": "example2_email@email.com",
          "Phone": "(111) 222-3333"
        }
      ]
    }
  }
}

# newline="" stops the csv module from writing blank lines between rows on Windows
with open("out.csv", "w+", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Address","Type","Owner","Description","Email","Phone"])
    for k, v in obj["Result"].items():
        row = [k]
        json_to_csv(v, row)
        writer.writerow(row)
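
The recursive approach above places values in columns by position, so it depends on every object listing its keys in the same order as the header. A possible alternative, sketched here with `csv.DictWriter`, matches values to columns by name instead. This assumes the keys inside each entry are spelled exactly like the column names; `iter_records` and the reordered `sample` data are made up for illustration.

```python
import csv
import io

FIELDS = ["Address", "Type", "Owner", "Description", "Email", "Phone"]


def iter_records(result):
    """Yield one dict per entry, for every type key (hypothetical helper)."""
    for address, types in result.items():
        for type_name, entries in types.items():
            for entry in entries:
                # Entry keys are assumed to match the column names.
                yield {"Address": address, "Type": type_name, **entry}


# Keys deliberately out of order, and "Phone" missing, to show the mapping.
sample = {
    "Example 1": {"Type1": [{"Email": "example1_email@email.com",
                             "Owner": "Name1 Example",
                             "Description": "Description1 Example"}]},
}

buffer = io.StringIO()
# restval fills in columns that an entry does not provide.
writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
writer.writeheader()
writer.writerows(iter_records(sample))
lines = buffer.getvalue().splitlines()
```

With the default settings, `DictWriter` raises a `ValueError` if an entry contains a key that is not in `fieldnames`; passing `extrasaction="ignore"` drops such keys instead.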

Figured it out!

I changed the f.writerow function to the following:

for key in x["Result"]:
    type = "Type1"
    f.writerow([key,
                type,
                x["Result"][key]["Type1"][0]["Owner"],
                x["Result"][key]["Type1"][0]["Email"]])
                ...

This allowed me to reference the keys within the object. Hopefully this helps someone down the line!
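
One thing to keep in mind with `[0]`: it reads only the first dict in the list, and it raises an IndexError if the list is empty. A small guard, sketched below, avoids the crash. `first_entry` and the "Example 3" address are hypothetical names used only for this illustration.

```python
def first_entry(entries):
    """Return the first dict in a list, or an empty dict if the list is empty."""
    return entries[0] if entries else {}


entry = first_entry([{"Owner": "Name1 Example",
                      "Email": "example1_email@email.com"}])
row = ["Example 1", "Type1", entry.get("Owner", ""), entry.get("Email", "")]

# An empty list no longer raises IndexError; the cells are left blank.
empty = first_entry([])
empty_row = ["Example 3", "Type1", empty.get("Owner", ""), empty.get("Email", "")]
```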
