Transforming a nested dict in a dataframe?
I have been trying to parse a nested dict in a data frame. I built this df from a dict, but couldn't figure out how to handle the nested one.
df
First second third
0 1 2 {nested dict}
nested dict:
{'fourth': '4', 'fifth': '5', 'sixth': '6'}, {'fourth': '7', 'fifth': '8', 'sixth': '9'}
My desired output would be:
First second fourth fifth sixth fourth fifth sixth
0 1 2 4 5 6 7 8 9
Edit: original dict
'archi': [{'fourth': '115',
'fifth': '-162',
'sixth': '112'},
{'fourth': '52',
'fifth': '42',
'sixth': ' 32'}]
I can't quite tell the format of the nested dict in the "third" column, but I recommend using "Python: Pandas dataframe from Series of dict" as a starting point. Here is a reproducible dict and dataframe:
import pandas as pd

nst_dict = {'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                      {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
df = pd.DataFrame.from_dict({'First': [1, 2], 'Second': [2, 3],
                             'third': [nst_dict, nst_dict]})
Then you need to first access the list within the dict, then the items of the list:
thrd_1 = df['third'].apply(lambda x: x['archi'])  # list of dicts per row
thrd_1a = thrd_1.apply(lambda x: x[0])  # access first item
thrd_1b = thrd_1.apply(lambda x: x[1])  # access second item
# expand each dict into columns, then join the pieces on the row index
out = df.drop('third', axis=1).merge(
    thrd_1a.apply(pd.Series).merge(thrd_1b.apply(pd.Series),
                                   left_index=True, right_index=True),
    left_index=True, right_index=True)
print(out)
First Second fourth_x fifth_x sixth_x fourth_y fifth_y sixth_y
0 1 2 115 -162 112 52 42 32
1 2 3 115 -162 112 52 42 32
I will try to clean this up with collections.abc and turn it into a function, but this should do the trick for your specific case.
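One way the merge steps above might be wrapped into a function is sketched below. The helper name expand_nested, its arguments, and the '_0', '_1' column suffixes are illustrative assumptions, not part of the original answer, and the sketch uses plain pandas rather than collections.abc.

```python
import pandas as pd

def expand_nested(df, col, key):
    """Expand df[col][key] (a list of dicts per row) into suffixed columns.

    Hypothetical helper: the name, arguments and suffix scheme are
    illustrative choices, not taken from the original answer.
    """
    # Longest list found in any row decides how many column groups we need
    n_items = int(df[col].map(lambda d: len(d[key])).max())
    parts = []
    for i in range(n_items):
        # i-th dict of each row's list (empty dict when the list is shorter)
        ith = df[col].map(lambda d, i=i: d[key][i] if i < len(d[key]) else {})
        parts.append(pd.DataFrame(ith.tolist(), index=df.index).add_suffix(f'_{i}'))
    return pd.concat([df.drop(columns=col)] + parts, axis=1)

nst_dict = {'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                      {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
df = pd.DataFrame({'First': [1, 2], 'Second': [2, 3],
                   'third': [nst_dict, nst_dict]})
out = expand_nested(df, 'third', 'archi')
print(out)
```

Numbered suffixes are used instead of the _x/_y that merge produces so the same code still works if a list holds more than two dicts.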
A "brute force" approach
import pandas as pd
import numpy as np

my_dict = {'Zero': 0, 'First': 1, 'Second': 2,
           'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                     {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
data_row = []
columns = []
for key in my_dict.keys():
    try:
        if len(my_dict[key]):
            for item in my_dict[key]:
                # iterate over nested dicts
                for k, v in item.items():
                    columns.append(k)
                    data_row.append(v)
    except TypeError:
        # scalars have no len(), so keep them under their own key
        data_row.append(my_dict[key])
        columns.append(key)
print(columns)
print(data_row)
# a single row with one column per collected value
data = np.array(data_row).reshape(1, -1)
df = pd.DataFrame(data, columns=columns)
print(df)
Output:
Zero First Second fourth fifth sixth fourth fifth sixth
0 0 1 2 115 -162 112 52 42 32
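The same loop can be written with an explicit type check instead of relying on len() raising TypeError. This is only a sketch: flatten_row is a hypothetical name, and it assumes every list value holds dicts, as in the example data.

```python
import pandas as pd

def flatten_row(record):
    """Return (columns, values) for one record, expanding lists of dicts in order.

    Hypothetical helper; assumes every list value contains only dicts.
    """
    columns, values = [], []
    for key, value in record.items():
        if isinstance(value, list):
            # one column per key of each nested dict, in list order
            for item in value:
                for k, v in item.items():
                    columns.append(k)
                    values.append(v)
        else:
            # scalar values keep their own key as the column name
            columns.append(key)
            values.append(value)
    return columns, values

my_dict = {'Zero': 0, 'First': 1, 'Second': 2,
           'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                     {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
columns, values = flatten_row(my_dict)
# passing a list of rows keeps the integers as integers (no NumPy string cast)
df = pd.DataFrame([values], columns=columns)
print(df)
```

Checking isinstance(value, list) also avoids a pitfall of the len() test: a string value has a length, so it would be iterated character by character and fail on item.items().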
I created a function using a recursive approach to flatten the dict structure:
original_dict = {'Zero': 0, 'First': 1, 'Second': 2,
                 'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                           {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
flattened_dict = {}

def flatten(obj, name=''):
    if isinstance(obj, dict):
        # recurse into each value, remembering its key as the column name
        for key, value in obj.items():
            flatten(value, key)
    elif isinstance(obj, list):
        for e in obj:
            flatten(e)
    else:
        # note: a repeated key (e.g. 'fourth' in both 'archi' entries)
        # overwrites the value stored earlier
        flattened_dict[name] = [obj]

flatten(original_dict)
Then the creation of the dataframe:
pd.DataFrame(flattened_dict)
With the following output:
Zero First Second fourth fifth sixth
0 0 1 2 52 42 32
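Because a dict cannot hold the same key twice, the recursive version keeps only the last 'fourth', 'fifth' and 'sixth' values it visits. A possible variant, sketched below, appends the list position to each name so both 'archi' entries survive. The function name flatten_indexed and the '_0', '_1' suffix scheme are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def flatten_indexed(obj, name='', suffix='', out=None):
    """Recursively flatten nested dicts/lists into {column_name: [value]}.

    Hypothetical variant: values found inside a list get the list position
    appended to their key ('fourth_0', 'fourth_1', ...).
    """
    if out is None:
        out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flatten_indexed(value, key, suffix, out)
    elif isinstance(obj, list):
        for i, element in enumerate(obj):
            # carry the position down so nested keys stay unique
            flatten_indexed(element, name, f'{suffix}_{i}', out)
    else:
        out[name + suffix] = [obj]
    return out

original_dict = {'Zero': 0, 'First': 1, 'Second': 2,
                 'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                           {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
flat = flatten_indexed(original_dict)
flat_df = pd.DataFrame(flat)
print(flat_df)
```

Returning the result from the function, rather than filling a module-level dict, also means repeated calls do not mix their results.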