
Pandas DataFrame from dictionary with nested lists of dictionaries

my_dict = { 'company_a': [],
            'company_b': [ {'gender': 'Male',
                            'investor': True,
                            'name': 'xyz',
                            'title': 'Board Member'} ],
            'company_c': [],
            'company_m': [ {'gender': 'Male',
                            'investor': None,
                            'name': 'abc',
                            'title': 'Advisor'}, 
                            {'gender': 'Male',
                            'investor': None,
                            'name': 'opq',
                            'title': 'Advisor'} ],
            'company_x': [],
            'company_y': [] }

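How can I convert the Python dictionary above into a Pandas DataFrame with the following columns: company, gender, investor, name, title?

The company column should be filled from the top-level keys of my_dict. The other columns should be filled with the values from the dictionaries inside the lists.

I have tried pd.DataFrame.from_dict(my_dict, orient='index'), but it does not give me what I want.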

This version fills all missing values with None:

import pandas as pd

# my_dict is the dictionary defined in the question.
fields = ['gender', 'investor', 'name', 'title']
data = {'company': [], 'gender': [], 'investor': [], 'name': [], 'title': []}
for k, v in my_dict.items():
    # A company with no entries still gets one row, filled with None.
    for entry in (v or [{}]):
        data['company'].append(k)
        for name in fields:
            data[name].append(entry.get(name))
df = pd.DataFrame(data)
print(df)

Output (row order follows the dictionary's iteration order, which on Python 3.7+ is insertion order):

     company gender investor  name         title
0  company_a   None     None  None          None
1  company_y   None     None  None          None
2  company_b   Male     True   xyz  Board Member
3  company_c   None     None  None          None
4  company_x   None     None  None          None
5  company_m   Male     None   abc       Advisor
6  company_m   Male     None   opq       Advisor

You can also replace every None with NaN:

import numpy as np

print(df.fillna(np.nan))

Output:

     company gender investor name         title
0  company_a    NaN      NaN  NaN           NaN
1  company_y    NaN      NaN  NaN           NaN
2  company_b   Male     True  xyz  Board Member
3  company_c    NaN      NaN  NaN           NaN
4  company_x    NaN      NaN  NaN           NaN
5  company_m   Male      NaN  abc       Advisor
6  company_m   Male      NaN  opq       Advisor
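
The same result can also be sketched more compactly by building one record per row and passing the list to pd.DataFrame. This is only an illustrative variant, not part of the answer above: the names rows and columns are made up here, and `entries or [{}]` is the trick that gives an empty company a single blank row. Keys missing from a record come out as missing values.

```python
import pandas as pd

my_dict = {
    'company_a': [],
    'company_b': [{'gender': 'Male', 'investor': True,
                   'name': 'xyz', 'title': 'Board Member'}],
    'company_c': [],
    'company_m': [{'gender': 'Male', 'investor': None,
                   'name': 'abc', 'title': 'Advisor'},
                  {'gender': 'Male', 'investor': None,
                   'name': 'opq', 'title': 'Advisor'}],
    'company_x': [],
    'company_y': [],
}

# Fixed column order requested in the question.
columns = ['company', 'gender', 'investor', 'name', 'title']

# One record per person; a company with an empty list contributes
# a single record that holds only the company name.
rows = [
    {'company': company, **entry}
    for company, entries in my_dict.items()
    for entry in (entries or [{}])
]

df = pd.DataFrame(rows, columns=columns)
print(df)
```

Because the frame is created in a single call, the column order is fixed up front and the rows follow the iteration order of my_dict.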

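It's a bit messy, but this is flexible: it picks up whatever attributes appear in the nested dictionaries and puts the companies in their own column. Companies with an empty list produce no rows.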

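import pandas as pd

# my_dict is the dictionary defined in the question.
df = pd.DataFrame(columns = ['company'])
i = 0  # index of the row being filled

for company in my_dict:
    for nested_dict in my_dict[company]:
        df.loc[i,'company'] = company
        # Each attribute becomes a column the first time it is seen.
        for attribute in nested_dict.keys():
            df.loc[i, attribute] = nested_dict[attribute]
        i += 1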

Output:

Out[46]:
    company     name  gender  title         investor
0   company_m   abc   Male    Advisor       NaN
1   company_m   opq   Male    Advisor       NaN
2   company_b   xyz   Male    Board Member  True
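
Growing a DataFrame one cell at a time with df.loc tends to get slow on larger inputs. As a hedged alternative sketch, pd.json_normalize can flatten the nested lists in one call once the dictionary is reshaped into a list of records. The 'people' key below is a hypothetical name introduced only for this example. As in the loop above, companies with an empty list produce no rows.

```python
import pandas as pd

my_dict = {
    'company_a': [],
    'company_b': [{'gender': 'Male', 'investor': True,
                   'name': 'xyz', 'title': 'Board Member'}],
    'company_c': [],
    'company_m': [{'gender': 'Male', 'investor': None,
                   'name': 'abc', 'title': 'Advisor'},
                  {'gender': 'Male', 'investor': None,
                   'name': 'opq', 'title': 'Advisor'}],
    'company_x': [],
    'company_y': [],
}

# Reshape into records that json_normalize understands.
# 'people' is an illustrative key, not something from the original data.
records = [
    {'company': company, 'people': people}
    for company, people in my_dict.items()
]

# record_path selects the nested list to expand into rows;
# meta copies the company name onto every expanded row.
df = pd.json_normalize(records, record_path='people', meta=['company'])
print(df)
```

The company column ends up last here; reorder with df[['company', ...]] if a specific column order matters.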

