I have a requirement to create a nested dictionary from a Pandas DataFrame.
Below is an example dataset in CSV format:
hostname,nic,vlan,status
server1,eth0,100,enabled
server1,eth2,200,enabled
server2,eth0,100
server2,eth1,100,enabled
server2,eth2,200
server1,eth1,100,disabled
Once the CSV is imported as a DataFrame I have:
>>> import pandas as pd
>>>
>>> df = pd.read_csv('test.csv')
>>>
>>> df
   hostname   nic  vlan    status
0   server1  eth0   100   enabled
1   server1  eth2   200   enabled
2   server2  eth0   100       NaN
3   server2  eth1   100   enabled
4   server2  eth2   200       NaN
5   server1  eth1   100  disabled
The output nested dictionary/JSON needs to group by the first two columns (hostname and nic), for example:
{
  "hostname": {
    "server1": {
      "nic": {
        "eth0": {
          "vlan": 100,
          "status": "enabled"
        },
        "eth1": {
          "vlan": 100,
          "status": "disabled"
        },
        "eth2": {
          "vlan": 200,
          "status": "enabled"
        }
      }
    },
    "server2": {
      "nic": {
        "eth0": {
          "vlan": 100
        },
        "eth1": {
          "vlan": 100,
          "status": "enabled"
        },
        "eth2": {
          "vlan": 200
        }
      }
    }
  }
}
I need to account for:
I have looked at groupby and MultiIndex in the Pandas documentation, but as a newcomer I have got stuck.
Any help is appreciated on the best method to achieve this.
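It may help to group the df first: df_new = df.groupby(["hostname", "nic"], as_index=False).first()
- note, as_index=False keeps hostname and nic as ordinary columns instead of moving them into the index. An aggregation such as .first() is needed because groupby on its own returns a GroupBy object, not a DataFrame.
You can then use df_new.to_json(orient='records', lines=True) to convert your df to JSON format (as jtweeder mentions in the comments). Note that this produces one flat JSON record per line, not the nested structure shown in the question. Once you have the desired format and would like to write it out, you can do something like:
with open('temp.json', 'w') as f:
    f.write(df_new.to_json(orient='records', lines=True))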
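If the nested layout from the question is what you are after, one possible approach is to loop over the groups and build the dictionary by hand. The following is only a rough sketch: the helper name to_nested is made up for illustration, the CSV is inlined so the snippet runs on its own, and it assumes each (hostname, nic) pair appears only once.

```python
import io
import json

import pandas as pd

# Sample data from the question, inlined so the example is self-contained.
CSV = """hostname,nic,vlan,status
server1,eth0,100,enabled
server1,eth2,200,enabled
server2,eth0,100
server2,eth1,100,enabled
server2,eth2,200
server1,eth1,100,disabled
"""


def to_nested(df):
    """Hypothetical helper: build {"hostname": {host: {"nic": {nic: {...}}}}}."""
    nested = {"hostname": {}}
    for (host, nic), group in df.groupby(["hostname", "nic"]):
        # Assumption: one row per (hostname, nic) pair, so take the first row.
        row = group.drop(columns=["hostname", "nic"]).iloc[0]
        # dropna() omits missing fields (e.g. an empty status); .item() turns
        # NumPy scalars into plain Python values so json.dumps accepts them.
        attrs = {
            key: (value.item() if hasattr(value, "item") else value)
            for key, value in row.dropna().items()
        }
        nested["hostname"].setdefault(host, {"nic": {}})["nic"][nic] = attrs
    return nested


df = pd.read_csv(io.StringIO(CSV))
nested = to_nested(df)
print(json.dumps(nested, indent=2))
```

Because groupby sorts the group keys by default, the NICs come out in eth0, eth1, eth2 order for each server, and rows without a status simply have no "status" key.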