Create new columns from the key-value pairs in a dataframe column
I have a data frame with many columns. One of the columns is named 'attributes', and it contains a list of dictionaries with keys and values. I want to extract each key and its value into its own column. This is what the data frame looks like:
The following adds the dictionary keys as additional columns while keeping the attributes column in the dataframe:
df = pd.concat([df, df["attributes"].apply(pd.Series)], axis=1)
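The one-liner above can be tried on a small frame. The sketch below uses made-up sample data (the id column and the color and size keys are hypothetical, not from the question) and assumes each cell of attributes holds a single dictionary. It also builds the new columns with pd.DataFrame(...tolist()) instead of apply(pd.Series); that is a swapped-in alternative that tends to be faster on large frames, not the method given in the answer.

```python
import pandas as pd

# Hypothetical sample data: each cell of "attributes" holds one dictionary.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "attributes": [
            {"color": "red", "size": "M"},
            {"color": "blue"},
            {"color": "green", "size": "L"},
        ],
    }
)

# Build the new columns from the list of dictionaries in one step,
# reusing the original index so the rows line up in concat.
expanded = pd.DataFrame(df["attributes"].tolist(), index=df.index)

# Keep the original columns (including "attributes") and append the new ones.
result = pd.concat([df, expanded], axis=1)

print(result)
```

Keys that are missing from a row's dictionary (size in the second row here) come out as NaN in the corresponding new column.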
For nested dictionaries, this simple example worked for me (here the initial column of dictionaries is colC, with the nested dictionaries in foo):
import pandas as pd
df = pd.DataFrame(
    {
        'colA': {0: 7, 1: 2, 2: 5, 3: 3, 4: 5},
        'colB': {0: 7, 1: 8, 2: 10, 3: 2, 4: 5},
        'colC': {
            0: {'foo': {"A": 5, "B": 6, "C": 9}, 'bar': 182, 'baz': 148},
            1: {'bar': 103, 'baz': 155},
            2: {'foo': 165, 'bar': 184, 'baz': 170},
            3: {'foo': 121, 'bar': 151, 'baz': 187},
            4: {'foo': 137, 'bar': 199, 'baz': 108},
        },
    }
)
df = pd.concat([df, df["colC"].apply(pd.Series)], axis=1)
#    colA  colB  colC                                                       foo                       bar    baz
# 0  7     7     {'foo': {'A': 5, 'B': 6, 'C': 9}, 'bar': 182, 'baz': 148}  {'A': 5, 'B': 6, 'C': 9}  182.0  148.0
# 1  2     8     {'bar': 103, 'baz': 155}                                   NaN                       103.0  155.0
# 2  5     10    {'foo': 165, 'bar': 184, 'baz': 170}                       165                       184.0  170.0
# 3  3     2     {'foo': 121, 'bar': 151, 'baz': 187}                       121                       151.0  187.0
# 4  5     5     {'foo': 137, 'bar': 199, 'baz': 108}                       137                       199.0  108.0
df = pd.concat([df, df["foo"].apply(pd.Series)], axis=1)
#    colA  colB  colC                                                       foo                       bar    baz    0      A    B    C
# 0  7     7     {'foo': {'A': 5, 'B': 6, 'C': 9}, 'bar': 182, 'baz': 148}  {'A': 5, 'B': 6, 'C': 9}  182.0  148.0  NaN    5.0  6.0  9.0
# 1  2     8     {'bar': 103, 'baz': 155}                                   NaN                       103.0  155.0  NaN    NaN  NaN  NaN
# 2  5     10    {'foo': 165, 'bar': 184, 'baz': 170}                       165                       184.0  170.0  165.0  NaN  NaN  NaN
# 3  3     2     {'foo': 121, 'bar': 151, 'baz': 187}                       121                       151.0  187.0  121.0  NaN  NaN  NaN
# 4  5     5     {'foo': 137, 'bar': 199, 'baz': 108}                       137                       199.0  108.0  137.0  NaN  NaN  NaN
Column 0 appears because the rows whose foo value is not a dictionary (a plain number or NaN) are each wrapped in a one-element Series labelled 0, but this should not be a problem.
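If the extra column 0 is unwanted, one option is to expand only the rows that actually hold a dictionary. The following is a minimal sketch on a smaller, hypothetical version of the example (three rows, with the names is_dict, expanded and result introduced here for illustration); it is an alternative to the apply(pd.Series) call above, not part of the original answer.

```python
import pandas as pd

# Hypothetical, smaller version of the example: "foo" mixes a dictionary,
# a missing value and a plain integer.
df = pd.DataFrame(
    {
        "colA": [7, 2, 5],
        "foo": [{"A": 5, "B": 6, "C": 9}, None, 165],
    }
)

# Boolean mask marking the rows whose "foo" value is a dictionary.
is_dict = df["foo"].apply(lambda value: isinstance(value, dict))

# Expand only those rows, keeping their original index labels.
expanded = pd.DataFrame(df.loc[is_dict, "foo"].tolist(), index=df.index[is_dict])

# join aligns on the index, so rows without a dictionary get NaN in A, B and C.
result = df.join(expanded)

print(result)
```

Because non-dictionary rows are never passed through pd.Series, no column 0 is created; the scalar values remain available in the original foo column.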