Pandas - Split a column of dictionaries into two columns holding the keys and values

Question

This is my column:

transcript["value"][1:4]

1    {'offer id': '0b1e1539f2cc45b7b9fa7c272da2e1d7'}
2    {'offer id': '2906b810c7d4411798c6938adc9daaa5'}
3    {'offer id': 'fafdcd668e3743c1bb461111dcafc2a4'}

What I am trying to achieve is this:

       type                          offer_id
0  offer_id  0b1e1539f2cc45b7b9fa7c272da2e1d7
1  offer_id  2906b810c7d4411798c6938adc9daaa5
2  offer_id  fafdcd668e3743c1bb461111dcafc2a4

I tried to convert it to a str and then split it, but this seems error prone and did not work at all:

transcript["value"].str.split(":")

Does anyone know how to achieve this? Preferably something that could handle multiple dictionaries in one column?
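
A note on why the `.str.split` attempt produces nothing useful: the cells hold Python dict objects rather than strings, and on an object column pandas string methods return missing values for elements that are not strings. A minimal sketch, assuming a small stand-in frame (the two-row `transcript` below is illustrative, not the asker's full data):

```python
import pandas as pd

# Illustrative stand-in for the asker's column of dicts
transcript = pd.DataFrame({
    "value": [
        {"offer id": "0b1e1539f2cc45b7b9fa7c272da2e1d7"},
        {"offer id": "2906b810c7d4411798c6938adc9daaa5"},
    ]
})

# Collect the Python types stored in the column: only dict, no str
element_types = {type(v) for v in transcript["value"]}

# A dict has no .split method, so each element comes back as a missing value
split_result = transcript["value"].str.split(":")

print(element_types)
print(split_result.isna().all())
```

Working on the dictionaries directly, as the answers do, avoids the round trip through strings.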

Answer 1

You could do:

import pandas as pd

transcript = pd.DataFrame([
    [{'offer_id': '0b1e1539f2cc45b7b9fa7c272da2e1d7'}],
    [{'offer_id': '2906b810c7d4411798c6938adc9daaa5'}],
    [{'offer_id': 'fafdcd668e3743c1bb461111dcafc2a4'}]
], columns=['value'])


res = pd.DataFrame([{'type' : key, 'offer_id' : value } for d in transcript['value'].tolist() for key, value in d.items()])
print(res)

Output:

       type                          offer_id
0  offer_id  0b1e1539f2cc45b7b9fa7c272da2e1d7
1  offer_id  2906b810c7d4411798c6938adc9daaa5
2  offer_id  fafdcd668e3743c1bb461111dcafc2a4
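
If a dictionary can hold more than one key, the same comprehension still yields one row per key-value pair, but the link back to the source row is lost. A minimal sketch that also records the originating row label; the helper name `dicts_to_long`, the `row` column, and the extra `reward` key are hypothetical and only serve the illustration:

```python
import pandas as pd

def dicts_to_long(series, key_name="type", value_name="value"):
    """Return one row per (key, value) pair, tagged with the source row label."""
    rows = [
        {"row": label, key_name: key, value_name: value}
        for label, d in series.items()      # label = index of the source row
        for key, value in d.items()         # one output row per dict entry
    ]
    return pd.DataFrame(rows, columns=["row", key_name, value_name])

# Illustrative data: the second dict carries a made-up extra key
transcript = pd.DataFrame({
    "value": [
        {"offer_id": "0b1e1539f2cc45b7b9fa7c272da2e1d7"},
        {"offer_id": "2906b810c7d4411798c6938adc9daaa5", "reward": 5},
    ]
})

long_df = dicts_to_long(transcript["value"])
print(long_df)
```

Keeping the `row` column makes it possible to merge the long table back onto the original frame later.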

Answer 2

The approach from the previous answer can be adapted to handle multiple dictionaries per cell, like this:

import pandas as pd

# Each cell of 'Values' holds a list of two dictionaries
data = [[[{'offer id': '0b1e1539f2cc45b7b9fa7c272da2e1d7'}, {'abc': '123'}]],
        [[{'offer id': '2906b810c7d4411798c6938adc9daaa5'}, {'def': '456'}]],
        [[{'offer id': 'fafdcd668e3743c1bb461111dcafc2a4'}, {'ghi': '789'}]]]

df = pd.DataFrame(data, columns=['Values'])

# Spread every row's list into one column per dictionary
df = pd.DataFrame(df['Values'].tolist(), columns=['dict1', 'dict2'])

# Turn each dictionary column into a key column and a value column
df1 = pd.DataFrame([{'key1': key, 'value1': value} for item in df['dict1'].tolist()
    for key, value in item.items()])

df2 = pd.DataFrame([{'key2': key, 'value2': value} for item in df['dict2'].tolist()
    for key, value in item.items()])

print(pd.concat([df1, df2], axis=1))

Output:

      key1                    value1               key2  value2
0   offer id    0b1e1539f2cc45b7b9fa7c272da2e1d7    abc   123
1   offer id    2906b810c7d4411798c6938adc9daaa5    def   456
2   offer id    fafdcd668e3743c1bb461111dcafc2a4    ghi   789
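
The `dict1`/`dict2` columns above assume exactly two dictionaries in every cell. When the lists vary in length, one possible alternative is `Series.explode`, which puts each dictionary on its own row while repeating the original index label. A minimal sketch, assuming every cell is a non-empty list of dicts (the shorter third row and the `row` column name are illustrative choices):

```python
import pandas as pd

# Illustrative data: the last cell holds one dict instead of two
df = pd.DataFrame({
    "Values": [
        [{"offer id": "0b1e1539f2cc45b7b9fa7c272da2e1d7"}, {"abc": "123"}],
        [{"offer id": "2906b810c7d4411798c6938adc9daaa5"}, {"def": "456"}],
        [{"offer id": "fafdcd668e3743c1bb461111dcafc2a4"}],
    ]
})

# One dict per row; the index label of the source row is repeated
exploded = df["Values"].explode()

# Flatten to long form: source row label, key, value
long_df = pd.DataFrame(
    [(label, key, value)
     for label, d in exploded.items()
     for key, value in d.items()],
    columns=["row", "key", "value"],
)
print(long_df)
```

The long layout avoids numbered column pairs such as `key1`/`key2`, at the cost of needing a pivot if a wide table is wanted afterwards.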

Answer 3

You can use the following, which works when each dictionary holds exactly one key-value pair:

df = df['value'].apply(lambda x: pd.Series(*x.items()))
df.columns = ['type', 'offer_id']

Output:

       type                          offer_id
0  offer_id  0b1e1539f2cc45b7b9fa7c272da2e1d7
1  offer_id  2906b810c7d4411798c6938adc9daaa5
2  offer_id  fafdcd668e3743c1bb461111dcafc2a4

If the keys are all the same, as in your case:

df['offer_id'] = df['value'].str.get('offer_id')
df['type'] = 'offer_id'


Answer 1: score 1, accepted, 2020-12-24 07:32:58
Answer 2: score 1, 2020-12-24 08:52:57
Answer 3: score 0, 2021-01-21 21:56:44