
Pandas DataFrame from complex data

I have a DataFrame with data like the following:

                cert                              meta
      {"alternate_names": [                  {"asset_name": "",
                                      "audience": "External",
        "asset_name": "",              "automation_utility": "",
        "audience": "External",             "delegate_owner": "",
        "automation_utility": "",               "environment": dev
        "delegate_owner": "",               "l2_group_email": null,
        "environment": dev              "l3_group_email": null,
        "l2_group_email": null,             "requestor_email": "",
        "l3_group_email": null,             "support_email": "",
        "requestor_email": "",              "tech_delegate_email": null,
        "support_email": "",                "tech_owner_email": null
        "tech_delegate_email": null,            }
        "tech_owner_email": null    
    }   
       cert does not exists                 cert does not exists
       cert does not exists                 cert does not exists

I checked the datatype of the column and it shows object. I need to create a DataFrame from status and support_email, but not all rows contain the same keys.

If status does not exist in a row, the result should show null for it.

Things I tried:

df = pd.DataFrame(data)
df["cert"] = df["cert"].apply(lambda x : dict(eval(x)) )
df2 = df["cert"].apply(pd.Series )
print(df) 

Can someone please guide me through this.

It looks like you have (mangled?) JSON content in your dataframe. You might be able to parse it with Python's json library and turn each cell into a dictionary. Then you could use each dictionary to load status and support_email into a dataframe.

Please see the example below, where I have taken one cell of the meta column of your example dataframe, corrected the JSON errors, and run it through the JSON loader.

import json

s = '''
{"asset_name": "",
"audience": "External",
"automation_utility": "",
"delegate_owner": "",
"environment": "dev",
"l2_group_email": null,
"l3_group_email": null,
"requestor_email": "",
"support_email": "",
"tech_delegate_email": null,
"tech_owner_email": null
}
'''

d1 = json.loads(s)
print(d1['environment'])
# dev
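
The same idea could be applied to the whole column. The following is only a minimal sketch under some assumptions: the sample rows are hypothetical stand-ins for your data, parse_cell is a helper name made up for illustration, and each cell is assumed to hold either valid JSON text or a placeholder string such as "cert does not exists". Cells that are still mangled (for example an unquoted dev) would fail to parse and end up as nulls, so they would need to be repaired first.

```python
import json

import pandas as pd

# Hypothetical sample data standing in for the "meta" column:
# one valid JSON string and two placeholder strings.
df = pd.DataFrame({
    "meta": [
        '{"environment": "dev", "support_email": "", "tech_owner_email": null}',
        "cert does not exists",
        "cert does not exists",
    ]
})


def parse_cell(cell):
    """Return the cell as a dict, or an empty dict if it is not a JSON object."""
    try:
        parsed = json.loads(cell)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return {}
    return parsed if isinstance(parsed, dict) else {}


wanted = ["status", "support_email"]
records = [parse_cell(cell) for cell in df["meta"]]

# dict.get returns None for missing keys, so absent fields show up as nulls.
result = pd.DataFrame(
    [{key: rec.get(key) for key in wanted} for rec in records],
    index=df.index,
)
print(result)
```

Using json.loads instead of eval avoids executing arbitrary text from the column, and looking keys up with dict.get means rows without status (or rows that are not JSON at all) produce a null rather than an error.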
