Convert list of dictionaries containing another list of dictionaries to dataframe
I tried searching for a solution but could not find one. I get the following output from an API in Python.
insights = [ <Insights> {
"account_id": "1234",
"actions": [
{
"action_type": "add_to_cart",
"value": "8"
},
{
"action_type": "purchase",
"value": "2"
}
],
"cust_id": "xyz123",
"cust_name": "xyz",
}, <Insights> {
"account_id": "1234",
"cust_id": "pqr123",
"cust_name": "pqr",
}, <Insights> {
"account_id": "1234",
"actions": [
{
"action_type": "purchase",
"value": "45"
}
],
"cust_id": "abc123",
"cust_name": "abc",
}
]
I want a DataFrame like this:
account_id  add_to_cart  purchase  cust_id  cust_name
1234        8            2         xyz123   xyz
1234                               pqr123   pqr
1234                     45        abc123   abc
When I use the following:
> insights_1 = [x for x in insights]
> df = pd.DataFrame(insights_1)
I get the following:
account_id  actions                                                                                    cust_id  cust_name
1234        [{'value': '8', 'action_type': 'add_to_cart'}, {'value': '2', 'action_type': 'purchase'}]  xyz123   xyz
1234        NaN                                                                                        pqr123   pqr
1234        [{'value': '45', 'action_type': 'purchase'}]                                               abc123   abc
How should I proceed?

Here is one solution.
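import numpy as np
import pandas as pd

df = pd.DataFrame(insights)

# One single-row frame per entry; a missing 'actions' value is NaN, not a list.
parts = [pd.DataFrame({d['action_type']: d['value'] for d in x}, index=[0])
         if isinstance(x, list)
         else pd.DataFrame({'add_to_cart': [np.nan], 'purchase': [np.nan]})
         for x in df['actions']]

# drop('actions', 1) no longer works in pandas 2.x; name the axis explicitly.
df = df.drop(columns='actions')\
       .join(pd.concat(parts, axis=0, ignore_index=True))
print(df)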
  account_id cust_id cust_name add_to_cart purchase
0       1234  xyz123       xyz           8        2
1       1234  pqr123       pqr         NaN      NaN
2       1234  abc123       abc         NaN       45
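
The list comprehension above has to tell a real list of actions apart from a missing entry. As a minimal sketch, assuming the API objects behave like plain dicts (the `records` sample and the name `df_demo` are made up for illustration), a record without an `actions` key comes out of `pd.DataFrame` as a float NaN. NaN is not equal to itself, which is what an `x == x` test relies on; an `isinstance(x, list)` check expresses the same intent more directly.

```python
import math
import pandas as pd

# Hypothetical stand-in for the API output: plain dicts instead of Insights objects.
records = [
    {"account_id": "1234", "cust_id": "xyz123",
     "actions": [{"action_type": "purchase", "value": "2"}]},
    {"account_id": "1234", "cust_id": "pqr123"},  # no "actions" key
]

df_demo = pd.DataFrame(records)

# Iterating a Series yields its values, so the two rows can be unpacked directly.
present, missing = df_demo["actions"]

# A real entry is a list and compares equal to itself.
print(isinstance(present, list), present == present)

# A missing entry is a float NaN, which never compares equal to itself.
print(isinstance(missing, float), missing == missing, math.isnan(missing))
```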
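Explanation
Use pandas to read the outer list of dictionaries into a DataFrame, then account for the nan values that appear in rows without actions.
Explanation, detailed
This details the construction and use of parts:
Take each entry of df['actions']; each entry is a list of dictionaries.
Iterate over them one by one (i.e. row by row) in the for loop.
The else part says "if the entry is np.nan [i.e. empty], return a DataFrame of nan values".
The if part takes the list of dictionaries and creates a mini DataFrame for each row.
The mini DataFrames are then concatenated and joined back onto df.

I think using apply on your df would be an option. First, I replace NaN with empty lists: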
# Assign the whole column at once; chained indexing such as df['actions'][mask] = ... may modify a copy.
df['actions'] = df['actions'].apply(lambda x: x if isinstance(x, list) else [])
Create a function add_to_cart that reads the list of actions and returns the value if the type is add_to_cart, then use apply to create the column:
def add_to_cart(list_action):
    for action in list_action:
        # for each action, see if the key action_type has the value add_to_cart and return the value
        if action['action_type'] == 'add_to_cart':
            return action['value']
    # if no add_to_cart action, then empty
    return ''

df['add_to_cart'] = df['actions'].apply(add_to_cart)
The same idea for purchase:
def purchase(list_action):
    for action in list_action:
        if action['action_type'] == 'purchase':
            return action['value']
    return ''

df['purchase'] = df['actions'].apply(purchase)
Then you can drop the actions column if you want:
df = df.drop('actions',axis=1)
EDIT: Define a single function find_action and apply it with an extra argument, for example:
def find_action(list_action, action_type):
    for action in list_action:
        # for each action, see if the key action_type is the one wanted
        if action['action_type'] == action_type:
            return action['value']
    # if not the right action type found, then empty
    return ''

# args must be a tuple; the trailing comma makes a one-element tuple
df['add_to_cart'] = df['actions'].apply(find_action, args=('add_to_cart',))
df['purchase'] = df['actions'].apply(find_action, args=('purchase',))
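
Putting the steps of this answer together, here is a minimal, self-contained sketch. It assumes the API objects behave like plain dicts; the `records` sample, the `result` name, and the loop over `action_types` are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the API output: plain dicts instead of Insights objects.
records = [
    {"account_id": "1234", "cust_id": "xyz123", "cust_name": "xyz",
     "actions": [{"action_type": "add_to_cart", "value": "8"},
                 {"action_type": "purchase", "value": "2"}]},
    {"account_id": "1234", "cust_id": "pqr123", "cust_name": "pqr"},
    {"account_id": "1234", "cust_id": "abc123", "cust_name": "abc",
     "actions": [{"action_type": "purchase", "value": "45"}]},
]


def find_action(list_action, action_type):
    """Return the value of the first action of the given type, or '' if absent."""
    for action in list_action:
        if action["action_type"] == action_type:
            return action["value"]
    return ""


result = pd.DataFrame(records)

# Missing "actions" entries arrive as NaN; replace them with empty lists.
result["actions"] = result["actions"].apply(
    lambda x: x if isinstance(x, list) else []
)

# Collect the action types that actually occur, sorted for a stable column order.
action_types = sorted(
    {a["action_type"] for actions in result["actions"] for a in actions}
)

# One new column per action type, filled by the shared helper.
for action_type in action_types:
    result[action_type] = result["actions"].apply(find_action, args=(action_type,))

result = result.drop("actions", axis=1)
print(result)
```

Deriving `action_types` from the data avoids hard-coding the column names, at the cost of one extra pass over the actions.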
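Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original source. For any questions, contact yoyou2525@163.com.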