
Transforming a Nested Dict in a dataframe?

I have been trying to parse a nested dict in a data frame. I built this df from a dict, but couldn't figure out how to handle the nested part.

df:

    First   second    third              

 0     1       2      {nested dict}

nested dict:

   {'fourth': '4', 'fifth': '5', 'sixth': '6'}, {'fourth': '7', 'fifth': '8', 'sixth': '9'}

My desired output would be:

        First   second  fourth   fifth   sixth   fourth   fifth   sixth          

 0     1       2       4         5        6         7       8       9

Edit: original dict

   'archi': [{'fourth': '115',
      'fifth': '-162',
      'sixth': '112'},
     {'fourth': '52',
      'fifth': '42',
      'sixth': ' 32'}]

I can't quite tell the format of the nested dict in the "third" column, but I recommend using the question "Python: Pandas dataframe from Series of dict" as a starting point. Here is a reproducible dict and dataframe:

import pandas as pd

nst_dict = {'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
      {'fourth': '52', 'fifth': '42','sixth': ' 32'}]}

df = pd.DataFrame.from_dict({'First':[1,2], 'Second':[2,3], 
     'third': [nst_dict,nst_dict]})

Then you need to access the list within the dict first, then the items of the list:

thrd_1 = df.third.apply(lambda x: x['archi'])  # list of dicts stored under 'archi'
thrd_1a = thrd_1.apply(lambda x: x[0])  # first dict in the list
thrd_1b = thrd_1.apply(lambda x: x[1])  # second dict in the list

# Expand each dict into columns, then join both halves back onto the frame by index.
out = df.drop('third', axis=1).merge(
    thrd_1a.apply(pd.Series).merge(thrd_1b.apply(pd.Series),
    left_index=True, right_index=True),
    left_index=True, right_index=True)

print(out)

   First  Second fourth_x fifth_x sixth_x fourth_y fifth_y sixth_y
0      1       2      115    -162     112       52      42      32
1      2       3      115    -162     112       52      42      32

I will try to clean this up with collections.abc and turn it into a function, but this should do the trick for your specific case.
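As a rough idea of what such a function could look like, here is a minimal sketch. The name expand_nested and its parameters are hypothetical (not from the answer above), and it assumes every cell holds a dict of the form {key: [dict, dict, ...]} with the same number of list items in each row. Columns are suffixed with the list position (_0, _1, ...) instead of the _x/_y suffixes that merge produces.

```python
import pandas as pd


def expand_nested(df, col, key):
    """Expand df[col], whose cells look like {key: [dict, ...]}, into flat columns.

    Hypothetical helper: assumes every row's list has the same length.
    """
    # Pull the list of dicts out of each cell.
    lists = df[col].apply(lambda cell: cell[key])
    n_items = len(lists.iloc[0])
    parts = []
    for pos in range(n_items):
        # Collect the dict at this list position from every row.
        dicts = [lst[pos] for lst in lists]
        # One column per inner key, suffixed with the position, e.g. fourth_0.
        part = pd.DataFrame(dicts, index=df.index).add_suffix(f"_{pos}")
        parts.append(part)
    # Drop the nested column and place the expanded columns next to the rest.
    return pd.concat([df.drop(columns=col)] + parts, axis=1)


nst_dict = {'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                      {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}
df = pd.DataFrame({'First': [1, 2], 'Second': [2, 3],
                   'third': [nst_dict, nst_dict]})

out = expand_nested(df, 'third', 'archi')
print(out)
```

Because the loop runs over list positions, the same call should handle three or more nested dicts per row without further changes, as long as the equal-length assumption holds.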

A "brute force" approach

import pandas as pd
import numpy as np

my_dict = {'Zero': 0, 'First': 1, 'Second': 2,
       'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}

data_row=[]
columns = []
for key in my_dict.keys():
    try:
        if len(my_dict[key]):
            for item in my_dict[key]:
                # iterate over nested dicts
                for k, v in item.items():
                    columns.append(k)
                    data_row.append(v)

    except TypeError:
        data_row.append(my_dict[key])
        columns.append(key)

print(columns)
print(data_row)

data = np.array(data_row).reshape(1, -1)  # a single row; -1 infers the number of columns
df = pd.DataFrame(data, columns=columns)
print(df)

Output:

     Zero   First   Second   fourth     fifth   sixth   fourth  fifth   sixth
0       0       1        2      115      -162     112      52      42      32
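The same brute-force idea can also be written with an explicit type check instead of relying on a TypeError from len(). The sketch below is only one possible variant: flatten_row is a hypothetical name, and it assumes every list value contains only dicts. Passing the row as a list of lists to pd.DataFrame also avoids hard-coding the column count and keeps the integers as integers.

```python
import pandas as pd


def flatten_row(record):
    """Return (columns, values) for one record.

    Hypothetical helper: assumes each value is a scalar or a list of dicts.
    """
    columns, values = [], []
    for key, value in record.items():
        if isinstance(value, list):
            # A list of nested dicts: emit one column per inner key, in order.
            for item in value:
                for k, v in item.items():
                    columns.append(k)
                    values.append(v)
        else:
            # A plain scalar keeps its own key as the column name.
            columns.append(key)
            values.append(value)
    return columns, values


my_dict = {'Zero': 0, 'First': 1, 'Second': 2,
           'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                     {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}

columns, values = flatten_row(my_dict)
# A single-row frame; duplicate column names (fourth, fifth, sixth) are kept.
df = pd.DataFrame([values], columns=columns)
print(df)
```

Note that the resulting frame has repeated column names, so df['fourth'] selects both columns at once.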

I created a function that uses a recursive approach to flatten the dict structure:

original_dict = {'Zero': 0, 'First': 1, 'Second': 2,
       'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}

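flattened_dict = {}

def flatten(obj, name=''):
    # Dicts: recurse into each value, passing its key along as the column name.
    if isinstance(obj, dict):
        for key, value in obj.items():
            flatten(value, key)
    # Lists: recurse into each element.
    elif isinstance(obj, list):
        for e in obj:
            flatten(e)
    # Scalars: store the value as a one-row column.
    # Note: a key that repeats (e.g. 'fourth' in both list items) overwrites the earlier value.
    else:
        flattened_dict[name] = [obj]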

flatten(original_dict)

Then create the dataframe:

import pandas as pd

pd.DataFrame(flattened_dict)

With the following output:

[image: screenshot of the resulting dataframe]
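Since the function stores values by key name, keys that occur in more than one list item ('fourth', 'fifth', 'sixth') end up in a single column each. One way to keep every value is to suffix keys with their list position. The following is a sketch under assumptions, not part of the original answer: flatten_indexed is a hypothetical name, it returns a new dict instead of filling a global one, and it assumes lists contain only flat dicts, as in the example.

```python
import pandas as pd


def flatten_indexed(obj, name='', out=None):
    """Recursively flatten obj into {column: [value]}.

    Hypothetical variant: keys found inside a list get the list position
    as a suffix (fourth_0, fourth_1, ...). Assumes lists hold flat dicts.
    """
    if out is None:
        out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            # `name` is '' at the top level and '_<position>' inside a list item.
            flatten_indexed(value, f"{key}{name}", out)
    elif isinstance(obj, list):
        for i, element in enumerate(obj):
            flatten_indexed(element, f"_{i}", out)
    else:
        # Scalar: store it as a one-row column.
        out[name] = [obj]
    return out


original_dict = {'Zero': 0, 'First': 1, 'Second': 2,
                 'archi': [{'fourth': '115', 'fifth': '-162', 'sixth': '112'},
                           {'fourth': '52', 'fifth': '42', 'sixth': ' 32'}]}

flat = flatten_indexed(original_dict)
df = pd.DataFrame(flat)
print(df)
```

Unique column names also make the values easier to select later than repeated 'fourth' columns would be.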
