
Convert dataframe into dictionary containing list of dictionaries

My dataframe is as shown below:

 name    key    value
 john    A223   390309
 jason   B439   230943
 peter   A5388  572039
 john    D23902 238939
 jason   F2390   23930

I want to convert the dataframe above into a nested dictionary that maps each name to a list of dictionaries, in the format shown below:

{'john': [{'key':'A223', 'value':'390309'}, {'key':'D23902', 'value':'238939'}],
 'jason': [{'key':'B439', 'value':'230943'}, {'key':'F2390', 'value':'23930'}],
 'peter': [{'key':'A5388', 'value':'572039'}]}

Could someone help with this?

Use a dictionary comprehension with to_dict:

d = {name:df.loc[df.name==name,['key','value']].to_dict('records') for name in df.name.unique()}

print(d)
{'john': [{'key': 'A223', 'value': 390309}, {'key': 'D23902', 'value': 238939}], 
 'jason': [{'key': 'B439', 'value': 230943}, {'key': 'F2390', 'value': 23930}], 
 'peter': [{'key': 'A5388', 'value': 572039}]}
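
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample dataframe from the question (the integer dtype of the value column is an assumption, since the question does not show dtypes) and casts value to str so the result matches the quoted format the question asks for. The name df_str is only illustrative.

```python
import pandas as pd

# Sample data copied from the question; the integer dtype of "value" is an assumption.
df = pd.DataFrame({
    "name": ["john", "jason", "peter", "john", "jason"],
    "key": ["A223", "B439", "A5388", "D23902", "F2390"],
    "value": [390309, 230943, 572039, 238939, 23930],
})

# Optional step: cast "value" to str so the output uses quoted values,
# as in the format requested in the question.
df_str = df.astype({"value": str})

# For each distinct name, select its rows and turn the key/value columns
# into a list of per-row dictionaries.
d = {
    name: df_str.loc[df_str["name"] == name, ["key", "value"]].to_dict("records")
    for name in df_str["name"].unique()
}
print(d)
```

Skipping the astype step keeps the values as integers, which is what the output above shows.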

You can use groupby, apply, iterrows and the Series method tolist, as below:

def f(rows):
    return {rows.iloc[0]['name']: [{'key': row['key'], 'value': row['value']} for _, row in rows.iterrows()]}

df.groupby("name").apply(f).tolist()

This generates one dictionary per name, collected in a list:

[{'jason': [{'key': 'B439', 'value': '230943'}, {'key': 'F2390', 'value': '23930'}]},
 {'john': [{'key': 'A223', 'value': '390309'}, {'key': 'D23902', 'value': '238939'}]},
 {'peter': [{'key': 'A5388', 'value': '572039'}]}]

Explanation:

  • groupby("name") groups the rows by name.
  • apply(f) then calls the function f on each of those groups of rows.
  • f iterates through the rows of a group with iterrows, building a list of dictionaries with [{'key': row['key'], 'value': row['value']} for _, row in rows.iterrows()]. It then takes the name from the first row with rows.iloc[0]['name'] and uses it as the key of the dictionary for that name.
  • tolist() collects the per-name dictionaries into a list.
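
The approach above returns a list of single-key dictionaries. If a single dictionary keyed by name is wanted, as in the question, one possible sketch is to iterate over the groupby object directly, since it yields (name, sub-dataframe) pairs. The sample data is rebuilt from the question, and the integer dtype of value is an assumption.

```python
import pandas as pd

# Sample data copied from the question; the integer dtype of "value" is an assumption.
df = pd.DataFrame({
    "name": ["john", "jason", "peter", "john", "jason"],
    "key": ["A223", "B439", "A5388", "D23902", "F2390"],
    "value": [390309, 230943, 572039, 238939, 23930],
})

# Iterating over a groupby yields (group key, sub-dataframe) pairs.
# sort=False keeps the names in order of first appearance.
result = {
    name: group[["key", "value"]].to_dict("records")
    for name, group in df.groupby("name", sort=False)
}
print(result)
```

This avoids iterrows and makes a single pass over the groups, rather than filtering the whole dataframe once per name.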

Try this:

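final_dict = {}

def dict_make(row):
    # Each call receives the sub-dataframe for one name.
    m_k = row['name'].values.tolist()[0]
    # Drop the name column and store the remaining columns as a list of records.
    final_dict[m_k] = row.set_index('name').to_dict(orient='records')

df.groupby('name').apply(dict_make)
print(final_dict)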

Output:

{'peter': [{'value': 572039, 'key': 'A5388'}], 
'john': [{'value': 390309, 'key': 'A223'}, {'value': 238939, 'key': 'D23902'}],
'jason': [{'value': 230943, 'key': 'B439'}, {'value': 23930, 'key': 'F2390'}]}
