I have a data frame like:
import pandas as pd

df = pd.DataFrame({'Geography': ['Geog1', 'Geog1', 'Geog1', 'Geog1', 'Geog2', 'Geog2', 'Geog2', 'Geog2'],
                   'Goal': ['G1', 'G1', 'G2', 'G2', 'G1', 'G1', 'G2', 'G2'],
                   'Indicator': ['G1I1', 'G1I2', 'G2I1', 'G2I2', 'G1I1', 'G1I2', 'G2I1', 'G2I2'],
                   'Year': [2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
                   'Data': [3, 5, 2, 6, 7, 4, 6, 6]})
and I want to convert it to a nested list of dictionaries like:
[{'Geography': 'Geog1', 'Info': [{'Goal': 'G1', 'Indicators': [{'Indicator': 'G1I1', 'dataYears': [{'Year': 2016, 'Data': 3}]}, {'Indicator': 'G1I2', 'dataYears': [{'Year': 2016, 'Data': 15.0}, {'Year': 2011, 'Data': 21.0}]....
I've managed to do this with the following (highly inefficient) code:
# Innermost level: collect the Year/Data pairs for each indicator
j = (df.groupby(['Geography', 'Goal', 'Indicator'])
       .apply(lambda x: x[['Year', 'Data']].to_dict('records'))
       .reset_index()
       .rename(columns={0: 'dataYear'}))
# Middle level: collect the indicators for each goal
j = (j.groupby(['Geography', 'Goal'])
      .apply(lambda x: x[['Indicator', 'dataYear']].to_dict('records'))
      .reset_index()
      .rename(columns={0: 'Indicators'}))
# Outermost level: collect the goals for each geography
j = (j.groupby(['Geography'])
      .apply(lambda x: x[['Goal', 'Indicators']].to_dict('records'))
      .reset_index()
      .rename(columns={0: 'Goals'})
      .to_dict('records'))
My question is: does anyone know a way to do this more efficiently? I have seen answers elsewhere, but they typically create a new nested level for each new column, whereas I want to include multiple columns in some levels of the dictionary (e.g., Year and Data).
You can easily fix it in the following way:
x = [df.to_dict()]  # create a list x whose only element is the dictionary of your DataFrame
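For the multi-level nesting described in the question, a single pass over the rows may be cheaper than repeated groupby/apply calls. The following is only a minimal sketch under that assumption and has not been benchmarked. The helper name nest_records is made up for illustration, and the key names 'Info', 'Indicators' and 'dataYears' are taken from the desired output shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Geography': ['Geog1', 'Geog1', 'Geog1', 'Geog1', 'Geog2', 'Geog2', 'Geog2', 'Geog2'],
    'Goal': ['G1', 'G1', 'G2', 'G2', 'G1', 'G1', 'G2', 'G2'],
    'Indicator': ['G1I1', 'G1I2', 'G2I1', 'G2I2', 'G1I1', 'G1I2', 'G2I1', 'G2I2'],
    'Year': [2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
    'Data': [3, 5, 2, 6, 7, 4, 6, 6],
})


def nest_records(frame):
    """Hypothetical helper: build the nested list of dicts in one pass over the rows."""
    columns = ['Geography', 'Goal', 'Indicator', 'Year', 'Data']
    nested = {}
    # Step 1: bucket rows as {geography: {goal: {indicator: [year/data dicts]}}}.
    for geog, goal, indicator, year, data in frame[columns].itertuples(index=False, name=None):
        (nested.setdefault(geog, {})
               .setdefault(goal, {})
               .setdefault(indicator, [])
               .append({'Year': year, 'Data': data}))
    # Step 2: reshape the nested dicts into the list-of-dicts layout.
    # Plain dicts keep insertion order, so the row order of the frame is preserved.
    return [
        {'Geography': geog,
         'Info': [
             {'Goal': goal,
              'Indicators': [
                  {'Indicator': indicator, 'dataYears': years}
                  for indicator, years in indicators.items()
              ]}
             for goal, indicators in goals.items()
         ]}
        for geog, goals in nested.items()
    ]


result = nest_records(df)
```

Because each level is just a dictionary keyed by that level's value, several columns (here Year and Data) can share the innermost level without creating an extra nesting level per column.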