Conversion of a nested JSON object to CSV using Python

{
  "_index" : "my-index-000001",
  "_type" : "_doc",
  "_id" : "2",
  "_score" : 1.0,
  "_source" : {
    "region" : "USA",
    "manager" : {
      "age" : 3,
      "name" : {
        "First_name" : [
          {"first" : "Joh"},
          {"first" : "Lion"}
        ],
        "Last_name" : [
          {"last" : "Johm"},
          {"last" : "Smihg"}
        ]
      }
    }
  }
}

I have been trying to convert this for a long time without success. Please help me. The desired output format is:

region    First_name.first    Last_name.last
USA       Joh                 Johm
USA       Lion                Smihg

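You can try the following, where d is the document above loaded as a Python dict:

import pandas as pd

# Flatten _source, drop the unused age column, then explode both list columns row by row
df = pd.json_normalize(d['_source']).drop(columns='manager.age').set_index(['region']).apply(pd.Series.explode).reset_index()
# After exploding, each cell holds a dict such as {'first': 'Joh'}; pull out its value
df['manager.name.First_name'] = df['manager.name.First_name'].str['first']
df['manager.name.Last_name'] = df['manager.name.Last_name'].str['last']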
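
For reference, here is a self-contained sketch of the same idea that also produces the column names asked for in the question and serialises the result as CSV. The variable names, the renamed column labels, and writing to an in-memory string instead of a file are assumptions made for illustration. Passing a list of columns to DataFrame.explode needs pandas 1.3 or later.

```python
import pandas as pd

# The search hit from the question as a Python dict
d = {
    "_index": "my-index-000001",
    "_type": "_doc",
    "_id": "2",
    "_score": 1.0,
    "_source": {
        "region": "USA",
        "manager": {
            "age": 3,
            "name": {
                "First_name": [{"first": "Joh"}, {"first": "Lion"}],
                "Last_name": [{"last": "Johm"}, {"last": "Smihg"}],
            },
        },
    },
}

first_col = "manager.name.First_name"
last_col = "manager.name.Last_name"

# Flatten _source and keep only the columns needed for the output
df = pd.json_normalize(d["_source"])[["region", first_col, last_col]]
# Explode both list columns together; the lists must have equal length
df = df.explode([first_col, last_col], ignore_index=True)
# Each cell is now a dict such as {"first": "Joh"}; extract the value
df[first_col] = df[first_col].str["first"]
df[last_col] = df[last_col].str["last"]
# Rename to the headers requested in the question (illustrative choice)
df = df.rename(columns={first_col: "First_name.first", last_col: "Last_name.last"})

csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead of a string, pass a path to to_csv, for example df.to_csv("out.csv", index=False), where out.csv is a placeholder file name.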

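You could use pd.json_normalize to extract the relevant lists and pd.concat to join them side by side (data is the document above as a Python dict):

import pandas as pd

# Each list of small dicts becomes a one-column dataframe
first = pd.json_normalize(data['_source']['manager']['name']['First_name'])
last = pd.json_normalize(data['_source']['manager']['name']['Last_name'])
# Join the two columns by position, then broadcast the region to every row
df = pd.concat([first, last], axis=1)
df['region'] = data['_source']['region']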

The resulting df is:

    first   last    region
0   Joh     Johm    USA
1   Lion    Smihg   USA
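
If pandas is not needed elsewhere, the same positional pairing can be sketched with the standard library alone. This assumes, as the expected output in the question implies, that the i-th entry of First_name belongs with the i-th entry of Last_name. The helper name rows_from_hit and the in-memory buffer are hypothetical choices for illustration, not part of the answers above.

```python
import csv
import io

# The search hit from the question as a Python dict
data = {
    "_index": "my-index-000001",
    "_type": "_doc",
    "_id": "2",
    "_score": 1.0,
    "_source": {
        "region": "USA",
        "manager": {
            "age": 3,
            "name": {
                "First_name": [{"first": "Joh"}, {"first": "Lion"}],
                "Last_name": [{"last": "Johm"}, {"last": "Smihg"}],
            },
        },
    },
}


def rows_from_hit(hit):
    """Yield one flat dict per (first, last) pair in a single hit."""
    source = hit["_source"]
    name = source["manager"]["name"]
    # zip pairs entries by position and stops at the shorter list
    for first, last in zip(name["First_name"], name["Last_name"]):
        yield {
            "region": source["region"],
            "First_name.first": first["first"],
            "Last_name.last": last["last"],
        }


fieldnames = ["region", "First_name.first", "Last_name.last"]
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows_from_hit(data))

csv_text = buffer.getvalue()
print(csv_text)
```

Because zip stops at the shorter list, unmatched trailing names would be dropped silently; itertools.zip_longest is an option if such rows should be kept with empty cells.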

However, the code would be simpler if your raw data had the following format, with each first and last name grouped in one record:

data = {
  "_index" : "my-index-000001",
  "_type" : "_doc",
  "_id" : "2",
  "_score" : 1.0,
  "_source" : {
    "region" : "USA",
    "manager" : {
      "age" : 3,
      "name" : [
        { "First_name" : {"first" : "Joh"},
          "Last_name" : {"last" : "Johm"} },
        { "First_name" : {"first" : "Lion"},
          "Last_name" : {"last" : "Smihg"} }
      ]
    }
  }
}

df = pd.json_normalize(data['_source']['manager']['name'])

Output:

  First_name.first Last_name.last
0              Joh           Johm
1             Lion          Smihg
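
With data in this restructured shape, the region can be carried along in the same call through the record_path and meta parameters of pd.json_normalize, and the frame can then be written out with DataFrame.to_csv. The following is a minimal sketch. Only the _source part of the hit is reproduced, and the column reordering and the in-memory CSV string are assumptions for illustration.

```python
import pandas as pd

# Restructured data: one record per person under manager.name
data = {
    "_source": {
        "region": "USA",
        "manager": {
            "age": 3,
            "name": [
                {"First_name": {"first": "Joh"}, "Last_name": {"last": "Johm"}},
                {"First_name": {"first": "Lion"}, "Last_name": {"last": "Smihg"}},
            ],
        },
    }
}

df = pd.json_normalize(
    data["_source"],
    record_path=["manager", "name"],  # one output row per entry in this list
    meta=["region"],                  # top-level field repeated on every row
)
# Put region first to match the layout requested in the question
df = df[["region", "First_name.first", "Last_name.last"]]

csv_text = df.to_csv(index=False)
print(csv_text)
```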

