Efficient way to create a nested dictionary using pandas DataFrame columns as keys
I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame({'Geography': ['Geog1', 'Geog1', 'Geog1', 'Geog1', 'Geog2', 'Geog2', 'Geog2', 'Geog2'],
                   'Goal': ['G1', 'G1', 'G2', 'G2', 'G1', 'G1', 'G2', 'G2'],
                   'Indicator': ['G1I1', 'G1I2', 'G2I1', 'G2I2', 'G1I1', 'G1I2', 'G2I1', 'G2I2'],
                   'Year': [2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
                   'Data': [3, 5, 2, 6, 7, 4, 6, 6]})
I want to convert it to a nested dictionary, for example:
[{'Geography': 'Geog1', 'Info': [{'Goal': 'G1', 'Indicators': [{'Indicator': 'G1I1', 'dataYears': [{'Year': 2016, 'Data': 3}]}, {'Indicator': 'G1I2', 'dataYears': [{'Year': 2016, 'Data': 15.0}, {'Year': 2011, 'Data': 21.0}]....
I managed to do this with the following (extremely inefficient) code:
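# Note: 'records' is spelled out because recent pandas versions
# no longer accept the 'r' shorthand for the orient argument.
# Level 1: collect the Year/Data rows for each indicator.
j = (df.groupby(['Geography', 'Goal', 'Indicator'])
       .apply(lambda x: x[['Year', 'Data']].to_dict('records'))
       .reset_index()
       .rename(columns={0: 'dataYear'}))
# Level 2: collect the indicators for each goal.
j = (j.groupby(['Geography', 'Goal'])
      .apply(lambda x: x[['Indicator', 'dataYear']].to_dict('records'))
      .reset_index()
      .rename(columns={0: 'Indicators'}))
# Level 3: collect the goals for each geography.
j = (j.groupby(['Geography'])
      .apply(lambda x: x[['Goal', 'Indicators']].to_dict('records'))
      .reset_index()
      .rename(columns={0: 'Goals'})
      .to_dict('records'))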
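My question is: does anyone know a more efficient way to do this? I have seen answers elsewhere, but they usually create a new nesting level for each new column, whereas I want to include multiple columns at some levels of the dictionary (e.g. Year and Data).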
You can easily fix it like this:
x = [df.to_dict()]  # create a list x whose only element is the dictionary form of your DataFrame
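Note that df.to_dict() on its own returns a column-oriented dictionary, not the nested structure shown in the question. As an alternative, here is a minimal sketch that builds the nesting in a single pass over the rows, without repeated groupby/apply calls. The function name to_nested is hypothetical, and the key names 'Info', 'Indicators', and 'dataYears' are assumed from the desired output in the question. It has not been benchmarked against the original code, so treat any speed-up as something to measure on your own data.

```python
import pandas as pd

df = pd.DataFrame({'Geography': ['Geog1', 'Geog1', 'Geog1', 'Geog1', 'Geog2', 'Geog2', 'Geog2', 'Geog2'],
                   'Goal': ['G1', 'G1', 'G2', 'G2', 'G1', 'G1', 'G2', 'G2'],
                   'Indicator': ['G1I1', 'G1I2', 'G2I1', 'G2I2', 'G1I1', 'G1I2', 'G2I1', 'G2I2'],
                   'Year': [2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
                   'Data': [3, 5, 2, 6, 7, 4, 6, 6]})


def to_nested(frame):
    """Hypothetical helper: nest rows by Geography > Goal > Indicator."""
    cols = ['Geography', 'Goal', 'Indicator', 'Year', 'Data']
    tree = {}
    # Single pass: file each row's Year/Data pair under its three keys.
    # Plain dicts keep insertion order, so first-seen order is preserved.
    for geo, goal, ind, year, data in frame[cols].itertuples(index=False, name=None):
        (tree.setdefault(geo, {})
             .setdefault(goal, {})
             .setdefault(ind, [])
             .append({'Year': year, 'Data': data}))
    # Convert the intermediate dict-of-dicts into the list-of-dicts layout.
    return [
        {'Geography': geo,
         'Info': [
             {'Goal': goal,
              'Indicators': [
                  {'Indicator': ind, 'dataYears': years}
                  for ind, years in indicators.items()
              ]}
             for goal, indicators in goals.items()
         ]}
        for geo, goals in tree.items()
    ]


result = to_nested(df)
```

Because Year and Data are written into the same leaf dictionary, several columns share the deepest level instead of each column opening a new level of nesting.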