
pandas split column values into separate columns

This is related to: exploding a pandas dataframe column

Here's my dataframe:

import pandas as pd
import numpy as np
d = {'id': [1, 1, 1, 2, 2, 2], 'data': [{'foo':True}, {'foo':False, 'bar':True}, {'foo':True, 'bar':False, 'baz':True}, {'foo':False}, {'foo':False, 'bar':False}, {'foo':False, 'bar':True, 'baz':False}]}
df = pd.DataFrame(data=d)
df

I'd like to create a new column for each key that appears in the dictionaries in column data, holding the relevant True and False values (and np.nan for any missing values).

My new dataframe would look like:

a = {'id': [1, 1, 1, 2, 2, 2], 'data': [{'foo':True}, {'foo':False, 'bar':True}, {'foo':True, 'bar':False, 'baz':True}, {'foo':False}, {'foo':False, 'bar':False}, {'foo':False, 'bar':True, 'baz':False}], 'foo':[True, False, True, False, False, False], 'bar':[np.nan, True, False, np.nan, False, True], 'baz':[np.nan, np.nan, True, np.nan, np.nan, False] }
df1 = pd.DataFrame(data=a)
df1

I'm not sure whether this can be achieved with Series.str.get_dummies, as I don't see how to map the True and False values. Appreciate any help!

Convert the column to a list to get a list of records, then turn that list into a DataFrame:

# pd.concat([df, pd.DataFrame(df['data'].tolist())], axis=1)
df.join(pd.DataFrame(df['data'].tolist()))

   id                                       data    bar    baz    foo
0   1                              {'foo': True}    NaN    NaN   True
1   1                {'foo': False, 'bar': True}   True    NaN  False
2   1   {'foo': True, 'bar': False, 'baz': True}  False   True   True
3   2                             {'foo': False}    NaN    NaN  False
4   2               {'foo': False, 'bar': False}  False    NaN  False
5   2  {'foo': False, 'bar': True, 'baz': False}   True  False  False
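One caveat with the join above: it aligns rows by index label, so it relies on df having the default 0..n-1 index. Below is a minimal sketch of the same approach for a frame with a different index; the index values 10, 20, 30 and the shortened data are assumptions made for illustration. On recent pandas versions the new columns may also come out in key insertion order (foo, bar, baz) rather than sorted as shown above.

```python
import pandas as pd

# Shortened version of the question's data with a non-default index
# (the index labels are made up for this example).
df = pd.DataFrame(
    {
        "id": [1, 1, 2],
        "data": [{"foo": True}, {"foo": False, "bar": True}, {"foo": False}],
    },
    index=[10, 20, 30],
)

# Build the expanded frame on the same index so that join lines up row for row.
expanded = pd.DataFrame(df["data"].tolist(), index=df.index)
result = df.join(expanded)
print(result)
```

Without index=df.index, the expanded frame would be labelled 0, 1, 2 and the join would find no matching rows, leaving the new columns entirely NaN.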

If the "data" column is not desired in the output, you can pop it before expanding:

df.join(pd.DataFrame(df.pop('data').tolist()))

   id    bar    baz    foo
0   1    NaN    NaN   True
1   1   True    NaN  False
2   1  False   True   True
3   2    NaN    NaN  False
4   2  False    NaN  False
5   2   True  False  False
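Note that pop removes the "data" column from df itself. If the original frame should stay intact, a non-mutating variant might look like the sketch below, which uses drop instead of pop; the shortened data is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 1, 2],
        "data": [{"foo": True}, {"foo": False, "bar": True}, {"foo": False}],
    }
)

# Expand the dictionaries first, then join onto a copy of df without "data".
expanded = pd.DataFrame(df["data"].tolist(), index=df.index)
result = df.drop(columns="data").join(expanded)
print(result)

# df still contains the original "data" column at this point.
print(df.columns.tolist())
```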

Reference: Convert a list of dictionaries to pandas DataFrame

I am using from_records:

pd.DataFrame.from_records(d['data'], index=d['id'])

     bar    baz    foo
1    NaN    NaN   True
1   True    NaN  False
1  False   True   True
2    NaN    NaN  False
2  False    NaN  False
2   True  False  False
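In the from_records result, id ends up as an unnamed index rather than a column. If a regular id column is preferred, one possible follow-up is to name the index and reset it, as in this sketch (the shortened d is an assumption for illustration):

```python
import pandas as pd

d = {
    "id": [1, 1, 2],
    "data": [{"foo": True}, {"foo": False, "bar": True}, {"foo": False}],
}

# Same call as above: the dictionaries become rows, the ids become the index.
wide = pd.DataFrame.from_records(d["data"], index=d["id"])

# Name the index "id" and move it back into an ordinary column.
result = wide.rename_axis("id").reset_index()
print(result)
```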
