How to create a dataframe from specific keys of a List of dictionaries in python?

I have a dictionary like this:

from decimal import Decimal

data_dict = {'Company': [
                    {
                        'Date': [{'L_End': {'@value': '2013-03-31'},
                                 'L_Start': {'@value': '2006-10-01'},
                                 'Type': {'@value': 'L'}},
                                {'O_Start': {'@value': '2006-10-01'},
                                 'Type': {'@value': 'O'}}],
                       'GeoLoc': {'Location': {'AddrLn1': 'HIGHLAND ROAD',
                                               'Country': 'ENGLAND',
                                               'County': 'HAMPSHIRE',
                                               'PostCode': 'PO49HU',
                                               'Town': 'SOUTHSEA',
                                               'UPRN': Decimal('1775')}},
                       'Name': 'Courtney',
                       'OrgId': {'@assigningAuthorityName': 'ABCDE',
                                 '@extension': 'a12cde',
                                 '@root': '2.16.840.1.1'},
                       'Status': {'@value': 'Active'},
                   },
                  {
                       'Date': [{'L_End': {'@value': '2013-03-31'},
                                 'L_Start': {'@value': '2006-10-01'},
                                 'Type': {'@value': 'L'}},
                                {'O_Start': {'@value': '2006-10-01'},
                                 'Type': {'@value': 'O'}}],
                       'GeoLoc': {'Location': {'AddrLn1': 'VILLIERS ROAD',
                                               'Country': 'ENGLAND',
                                               'County': 'HAMPSHIRE',
                                               'PostCode': 'PO52HG',
                                               'Town': 'SOUTHSEA'}},
                       'Name': 'MERLIN',
                       'OrgId': {'@assigningAuthorityName': 'ABCDE',
                                 '@extension': 'b12cde',
                                 '@root': '2.16.840.1.1'},
                       'Status': {'@value': 'Active'}
                   }
                  ]
}

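My expected output is shown in the linked image.

I tried doing this with the code below, and it works fine, but I would like to know whether I can read only the specific data I need instead of splitting each row into columns multiple times.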

import pandas as pd

df = pd.DataFrame.from_dict(data_dict)
df1 = pd.DataFrame(df.Company.values.tolist())

df2 = pd.concat([df1, df1.OrgId.apply(lambda x: pd.Series(x))], axis=1)
df2 = df2.drop(['OrgId', '@root', '@assigningAuthorityName'], axis=1)
df2 = df2.rename(columns={"@extension": "OrgId"})

df3 = pd.concat([df2, df2.Date.apply(lambda x: pd.Series(x))], axis=1)
df3 = df3.drop('Date', axis=1)

df4 = pd.concat([df3, df3[0].apply(lambda x: pd.Series(x))], axis=1)
df4 = df4.drop([0, 'Type'], axis=1)
# .... followed the same pattern until I got all the columns as I needed them to be

df11 = pd.concat([df10, df10.Location.apply(lambda x: pd.Series(x))], axis=1)
df11 = df11.drop('Location', axis=1)

final_df = df11.copy()
print(final_df)
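
The repeated concat / apply / drop step above could be wrapped in one small helper. This is only a minimal sketch, assuming every cell in the chosen column is a dict; the name `expand_dict_column` and the two-row sample are hypothetical and not part of the original code.

```python
import pandas as pd


def expand_dict_column(df, column):
    """Replace a column of dicts with one column per dict key."""
    # Turn each dict into its own row of values; keys missing in a row become NaN.
    expanded = df[column].apply(pd.Series)
    # Drop the original nested column and append the expanded columns.
    return pd.concat([df.drop(columns=column), expanded], axis=1)


# Tiny stand-in for the Company records (not the full data from the question).
sample = pd.DataFrame(
    [
        {"Name": "Courtney", "OrgId": {"@extension": "a12cde", "@root": "2.16.840.1.1"}},
        {"Name": "MERLIN", "OrgId": {"@extension": "b12cde", "@root": "2.16.840.1.1"}},
    ]
)
result = expand_dict_column(sample, "OrgId")
print(result.columns.tolist())
```

Each call removes one level of nesting, so deeper structures such as GeoLoc -> Location would still need one call per level.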

What is the best way to do this?

Try:

lst = [
    {
        "Name": c.get("Name"),
        "OrgId": c.get("OrgId", {}).get("@extension"),
        "L_Start": next((d for d in c.get("Date", []) if "L_Start" in d), {})
        .get("L_Start", {})
        .get("@value"),
        "L_End": next((d for d in c.get("Date", []) if "L_End" in d), {})
        .get("L_End", {})
        .get("@value"),
        "Status": c.get("Status", {}).get("@value"),
        **c.get("GeoLoc", {}).get("Location", {}),
    }
    for c in data_dict["Company"]
]

df = pd.DataFrame(lst)
print(df)

Prints:

       Name   OrgId     L_Start       L_End  Status        AddrLn1  Country     County PostCode      Town  UPRN
0  Courtney  a12cde  2006-10-01  2013-03-31  Active  HIGHLAND ROAD  ENGLAND  HAMPSHIRE   PO49HU  SOUTHSEA  1775
1    MERLIN  b12cde  2006-10-01  2013-03-31  Active  VILLIERS ROAD  ENGLAND  HAMPSHIRE   PO52HG  SOUTHSEA   NaN
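
The same lookup can be factored into small functions so that the `next(...)` pattern is written only once. The following is a sketch under stated assumptions rather than a definitive version: `date_value` and `to_row` are hypothetical helper names, and `companies` is a trimmed stand-in for `data_dict["Company"]` whose second record deliberately has no `L_Start` / `L_End` entry.

```python
import pandas as pd


def date_value(company, key):
    """Return the '@value' under `key` from the first Date entry that has it, else None."""
    entry = next((d for d in company.get("Date", []) if key in d), {})
    return entry.get(key, {}).get("@value")


def to_row(company):
    """Pick only the wanted fields from one Company record."""
    return {
        "Name": company.get("Name"),
        "OrgId": company.get("OrgId", {}).get("@extension"),
        "L_Start": date_value(company, "L_Start"),
        "L_End": date_value(company, "L_End"),
        "Status": company.get("Status", {}).get("@value"),
        # Spread the address fields into top-level columns.
        **company.get("GeoLoc", {}).get("Location", {}),
    }


# Trimmed stand-in for data_dict["Company"] (assumed shape, shortened values).
companies = [
    {
        "Date": [
            {"L_End": {"@value": "2013-03-31"},
             "L_Start": {"@value": "2006-10-01"},
             "Type": {"@value": "L"}},
            {"O_Start": {"@value": "2006-10-01"}, "Type": {"@value": "O"}},
        ],
        "GeoLoc": {"Location": {"AddrLn1": "HIGHLAND ROAD", "Town": "SOUTHSEA"}},
        "Name": "Courtney",
        "OrgId": {"@extension": "a12cde", "@root": "2.16.840.1.1"},
        "Status": {"@value": "Active"},
    },
    {
        # No L_Start / L_End here, to exercise the missing-key path.
        "Date": [{"O_Start": {"@value": "2006-10-01"}, "Type": {"@value": "O"}}],
        "GeoLoc": {"Location": {"AddrLn1": "VILLIERS ROAD", "Town": "SOUTHSEA"}},
        "Name": "MERLIN",
        "OrgId": {"@extension": "b12cde", "@root": "2.16.840.1.1"},
        "Status": {"@value": "Active"},
    },
]

df = pd.DataFrame([to_row(c) for c in companies])
print(df)
```

Because every `.get` call has a default, a record without the requested date yields a missing value in that column instead of raising an `AttributeError`.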

