
Obtaining keys and values from a nested array in nested JSON

First time posting! I am converting JSON data (a dictionary) from a server into a CSV file. The keys and values come out fine except for the nested "Astronauts" entry, which is an array. Each JSON string is one record that may contain anywhere from zero to an unbounded number of astronauts, whose attributes I would like to extract as independent values. For instance, something like this:

  • Astronaut1_Spaceships_First: Katabom
  • Astronaut1_Spaceships_Second: The Kraken
  • Astronaut1_Name: Jebeddiah
  • (...)
  • Astronaut2_Gender: Hopefully female

and so on. The problem is that the nested entry is an array and not a dictionary, so I do not know how to handle it. I have tried the dpath library as well as flattening the nested structure, but nothing changed. Any ideas?

import json
import os
import csv
import datetime
import dpath.util #Dpath library needs to be installed first

datum = {"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}

#Parsing process
parsed = json.loads(datum)  #datum is the JSON string retrieved from the server

def flattenjson(parsed, delim):
    val = {}
    for i in parsed.keys():
        if isinstance(parsed[i], dict):
            get = flattenjson(parsed[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = parsed[i]

    return val
flattened = flattenjson(parsed,"__")

#process of creating csv file
keys=['Astronaut1_Spaceship_First','Astronaut2_Spaceship_Second', 'Astronaut1_Name']  #reduced to 3 keys for this example

writer = csv.DictWriter(OD, keys ,restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect= "excel")
writer.writerow(flattened)


#JSON DATA FROM SERVER
{
"Mission": "Make Earth Greater Again",
"Objective": "Prove Earth is flat",
"Astronauts": [    {
  "Spaceships": {
    "First": "Katabom",
    "Second": "The Kraken"
  },
  "Name": "Jebeddiah",
  "Gender": "Hopefully male",
  "Age": 35,
  "Prefered colleages": [],
  "Following missions": [
    {
      "Payment_status": "TO BE CONFIRMED"
    }
  ]
},
{
  "Spaceships": {
    "First": "The Kraken",
    "Second": "Minnus I"
  },
  "Name": "Bob",
  "Gender": "Hopefully female",
  "Age": 23,
  "Prefered colleages": [],
  "Following missions": [
    {
      "Payment_status": "TO BE CONFIRMED"
    }
  ]
}
  ]
}

Firstly, the datum you have defined here is not the datum that would be retrieved from the server. The datum from the server would be a string, whereas the datum in this program is already parsed into a dictionary. Now, assuming datum to be:

datum = '{"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}'
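The difference between the raw string and the already-parsed dictionary can be illustrated with a short sketch. The shortened `raw` value is an assumption made up for brevity, not the real server response; the behaviour shown is that of `json.loads` in Python 3.

```python
import json

# Shortened, hypothetical stand-in for the string received from the server.
raw = '{"Mission": "Make Earth Greater Again", "Astronauts": [{"Name": "Bob"}]}'

parsed = json.loads(raw)  # str -> dict
print(type(parsed).__name__)
print(parsed["Astronauts"][0]["Name"])

# Passing the already-parsed dict back into json.loads is an error,
# which is what happens if datum is defined as a dict literal.
try:
    json.loads(parsed)
except TypeError as exc:
    print("TypeError:", exc)
```

So `json.loads` should be applied exactly once, to the string form of the data.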

You don't need the dpath library. The problem here is that your JSON flattener doesn't handle embedded lists. Try using the one below. Assuming that you want a one-line CSV file:

import csv
import json
def flattenjson(data, delim, topname=''):
    """JSON flattener that can handle embedded lists and dictionaries"""
    flattened = {}
    def internalflat(int_data, name=topname):
        if type(int_data) is dict:
            for key in int_data:
                internalflat(int_data[key], name + key + delim)
        elif type(int_data) is list:
            i = 1
            for elem in int_data:
                internalflat(elem, name + str(i) + delim)
                i += 1
        else:
            flattened[name[:-len(delim)]] = int_data
    internalflat(data)
    return flattened
#If you don't want mission or objective in csv file
flattened_astronauts = flattenjson(json.loads(datum)["Astronauts"], "__", "Astronaut")
keys = sorted(flattened_astronauts.keys())  # list.sort() returns None, so .keys().sort() would not give a list of keys
writer = csv.DictWriter(OD, keys ,restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect= "excel")
writer.writerow(flattened_astronauts)
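Since the question mentions records with varying numbers of astronauts, here is a rough sketch of one way to write several records to the same CSV: flatten each record, then build the header from the union of all keys so that shorter records fall back to `restval`. The helper name `flatten`, the sample `records`, and the use of `io.StringIO` in place of a real output file are assumptions for illustration only.

```python
import csv
import io
import json

def flatten(data, delim, topname=""):
    """Recurse through dicts and lists, same idea as flattenjson above."""
    flat = {}

    def walk(node, name):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, name + key + delim)
        elif isinstance(node, list):
            # Lists are numbered from 1: Astronaut1, Astronaut2, ...
            for i, elem in enumerate(node, start=1):
                walk(elem, name + str(i) + delim)
        else:
            # Leaf value: drop the trailing delimiter from the key.
            flat[name[:-len(delim)]] = node

    walk(data, topname)
    return flat

# Hypothetical server responses: one with two astronauts, one with one.
records = [
    '{"Astronauts": [{"Name": "Jebeddiah", "Age": 35}, {"Name": "Bob", "Age": 23}]}',
    '{"Astronauts": [{"Name": "Valentina", "Age": 41}]}',
]

rows = [flatten(json.loads(r)["Astronauts"], "__", "Astronaut") for r in records]

# Header is the union of every key seen across all rows.
fieldnames = sorted({key for row in rows for key in row})

out = io.StringIO()  # stand-in for an open output file
writer = csv.DictWriter(out, fieldnames, restval="Null", quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Note that empty lists such as "Prefered colleages" produce no column at all, and that a plain alphabetical sort places Astronaut10 before Astronaut2 once there are ten or more astronauts, so a different column ordering may be preferable in that case.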
