Convert a list of dictionaries that contains another list of dictionaries into a DataFrame

Question

I tried to find a solution but could not find one. I get the following output from an API in Python.

insights = [ <Insights> {
    "account_id": "1234",
    "actions": [
        {
            "action_type": "add_to_cart",
            "value": "8"
        },
        {
            "action_type": "purchase",
            "value": "2"
        }
    ],
    "cust_id": "xyz123",
    "cust_name": "xyz",
}, <Insights> {
    "account_id": "1234",
    "cust_id": "pqr123",
    "cust_name": "pqr",
},  <Insights> {
    "account_id": "1234",
    "actions": [
        {
            "action_type": "purchase",
            "value": "45"
        }
    ],
    "cust_id": "abc123",
    "cust_name": "abc",
 }
 ]

I want a DataFrame like this

account_id  add_to_cart  purchase  cust_id  cust_name
1234        8            2         xyz123   xyz
1234                               pqr123   pqr
1234                     45        abc123   abc

When I use the following

insights_1 = [x for x in insights]
df = pd.DataFrame(insights_1)

I get the following

account_id  actions                                                                                    cust_id  cust_name
1234        [{'value': '8', 'action_type': 'add_to_cart'}, {'value': '2', 'action_type': 'purchase'}]  xyz123   xyz
1234        NaN                                                                                        pqr123   pqr
1234        [{'value': '45', 'action_type': 'purchase'}]                                               abc123   abc

How should I proceed?

Answer 1

Here is one solution.

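import numpy as np
import pandas as pd

df = pd.DataFrame(insights)

# One single-row DataFrame per record; a missing 'actions' entry is NaN, and NaN != NaN.
parts = [pd.DataFrame({d['action_type']: d['value'] for d in x}, index=[0])
         if x == x else pd.DataFrame({'add_to_cart': [np.nan], 'purchase': [np.nan]})
         for x in df['actions']]

# Drop the nested column and attach the expanded action columns.
# axis must be passed by keyword; recent pandas versions reject it positionally.
df = df.drop('actions', axis=1)\
       .join(pd.concat(parts, axis=0, ignore_index=True))

print(df)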

  account_id cust_id cust_name add_to_cart purchase
0       1234  xyz123       xyz           8        2
1       1234  pqr123       pqr         NaN      NaN
2       1234  abc123       abc         NaN       45

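Explanation

Use pandas to read the outer list of dictionaries into a DataFrame.
For the inner dictionaries, use a list comprehension together with a dictionary comprehension.
Account for NaN values by testing for equality inside the list comprehension.
Concatenate the parts and join them to the original DataFrame.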
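
The four steps above can be tried end to end on plain dictionaries. This is only a minimal sketch: it assumes the Insights objects behave like ordinary dicts, so the sample list below is a stand-in for the real API output, and the names actions and result are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for the API output: plain dicts instead of <Insights> objects (assumption).
insights = [
    {"account_id": "1234",
     "actions": [{"action_type": "add_to_cart", "value": "8"},
                 {"action_type": "purchase", "value": "2"}],
     "cust_id": "xyz123", "cust_name": "xyz"},
    {"account_id": "1234", "cust_id": "pqr123", "cust_name": "pqr"},
    {"account_id": "1234",
     "actions": [{"action_type": "purchase", "value": "45"}],
     "cust_id": "abc123", "cust_name": "abc"},
]

# Step 1: the outer list becomes a DataFrame; a missing 'actions' key becomes NaN.
df = pd.DataFrame(insights)

# Steps 2 and 3: one single-row frame per record; NaN fails the x == x test.
parts = [pd.DataFrame({d["action_type"]: d["value"] for d in x}, index=[0])
         if x == x else pd.DataFrame({"add_to_cart": [np.nan], "purchase": [np.nan]})
         for x in df["actions"]]

# Step 4: stack the parts and join them to the original frame by row position.
actions = pd.concat(parts, axis=0, ignore_index=True)
result = df.drop("actions", axis=1).join(actions)
print(result)
```

With this sample data, the second record (no actions) ends up with NaN in both new columns, and the third record has NaN for add_to_cart.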

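Explanation - detailed

This describes the construction and use of parts in more detail:

Take each entry of df['actions']; each entry is a list of dictionaries.
Iterate over them one by one (i.e. row by row) in the for loop.
The else part says "if it is np.nan [i.e. empty], then return a DataFrame of NaNs". The if part takes the list of dictionaries and creates a mini DataFrame for each row.
We then use the next statement to concatenate these mini DataFrames, one per row, and join them to the original DataFrame.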
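
The x == x test relies on the fact that NaN compares unequal to itself, while a list always compares equal to itself. A small sketch of that check in isolation (the helper name is_missing is hypothetical and used only for illustration):

```python
import numpy as np

def is_missing(x):
    # NaN is not equal to itself; a list is, so only NaN is flagged here.
    return not (x == x)

cells = [[{"action_type": "purchase", "value": "45"}], np.nan, []]
flags = [is_missing(c) for c in cells]
print(flags)  # [False, True, False]
```

Note that an empty list is not flagged as missing by this check; only the NaN placeholder is.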

Answer 2

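I think using apply on your df would be an option. First, I would replace the NaN values with empty lists (done here with apply rather than chained indexing, which can silently fail to modify df):

df['actions'] = df['actions'].apply(lambda x: x if isinstance(x, list) else [])

Create a function add_to_cart that reads the list of actions and returns the value when the type is add_to_cart, then use apply to create the column: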

def add_to_cart(list_action):
    for action in list_action:
        # for each action, see if the key action_type has the value add_to_cart and return the value
        if action['action_type'] == 'add_to_cart':
            return action['value']
    # if no add_to_cart action, then empty
    return ''

df['add_to_cart'] = df['actions'].apply(add_to_cart)

Same idea for purchase:

def purchase(list_action):
    for action in list_action:
        if action['action_type'] == 'purchase':
            return action['value']
    return ''

df['purchase'] = df['actions'].apply(purchase)

Then you can drop the actions column if you want:

df = df.drop('actions',axis=1)

EDIT: define a single function find_action, then apply it with an extra argument, for example:

def find_action(list_action, action_type):
    for action in list_action:
        # for each action, see if the key action_type is the one wanted
        if action['action_type'] == action_type:
            return action['value']
    # if not the right action type found, then empty
    return ''
df['add_to_cart'] = df['actions'].apply(find_action, args=('add_to_cart',))
df['purchase'] = df['actions'].apply(find_action, args=('purchase',))
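
The single-function version can be run end to end as follows. This is a minimal sketch, assuming the actions column holds plain lists of dicts; the sample frame and the loop over action types are illustrative additions, not part of the original answer.

```python
import pandas as pd

def find_action(list_action, action_type):
    # Return the value of the first action whose type matches, else ''.
    for action in list_action:
        if action["action_type"] == action_type:
            return action["value"]
    return ""

# Sample frame shaped like the question's data (assumption: plain dicts and lists).
df = pd.DataFrame({
    "cust_id": ["xyz123", "pqr123", "abc123"],
    "actions": [
        [{"action_type": "add_to_cart", "value": "8"},
         {"action_type": "purchase", "value": "2"}],
        float("nan"),
        [{"action_type": "purchase", "value": "45"}],
    ],
})

# Replace missing entries with empty lists so the loop in find_action is safe.
df["actions"] = df["actions"].apply(lambda x: x if isinstance(x, list) else [])

# One new column per action type of interest.
for action_type in ["add_to_cart", "purchase"]:
    df[action_type] = df["actions"].apply(find_action, args=(action_type,))

df = df.drop("actions", axis=1)
print(df)
```

Unlike the first answer, this fills missing action types with an empty string rather than NaN, so the new columns stay as strings.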

Answer 1
4 accepted 2018-04-30 20:02:51

Answer 2
1 2018-04-30 19:42:08
