Pandas DataFrame: spread a CSV column to multiple columns

Question

I have a pandas DataFrame:

>>> import pandas as pd
>>> df = pd.DataFrame([['a', 2, 3], ['a,b', 5, 6], ['c', 8, 9]])
>>> df
     0  1  2
0    a  2  3
1  a,b  5  6
2    c  8  9

I want to spread the first column to n columns (where n is the number of unique, comma-separated values, in this case 3). Each of the resulting columns shall be 1 if the value is present, and 0 otherwise. The expected result is:

   1  2  a  c  b
0  2  3  1  0  0
1  5  6  1  0  1
2  8  9  0  1  0

I came up with the following code, but it seems a bit circuitous to me.

>>> import re
>>> dfSpread = pd.get_dummies(df[0].str.split(',', expand=True)).rename(
...     columns=lambda x: re.sub('.*_', '', x))
>>> pd.concat([df.iloc[:, 1:], dfSpread], axis=1)

Is there a built-in function that does just that, which I wasn't able to find?

Answer 1

Using str.get_dummies, which splits each string on the separator and returns one indicator column per unique value:

df.set_index([1,2])[0].str.get_dummies(',').reset_index()
Out[229]: 
   1  2  a  b  c
0  2  3  1  0  0
1  5  6  1  1  0
2  8  9  0  0  1
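
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. The variable name `result` is only illustrative and not part of the answer. Note that the indicator columns come back sorted alphabetically (a, b, c), not in the order of first appearance shown in the question.

```python
import pandas as pd

# Same sample data as in the question; column 0 holds comma-separated values.
df = pd.DataFrame([['a', 2, 3], ['a,b', 5, 6], ['c', 8, 9]])

# Park the columns we want to keep in the index, expand column 0 into
# one indicator column per unique value, then bring the index back as columns.
result = df.set_index([1, 2])[0].str.get_dummies(sep=',').reset_index()

print(result)
```

Because `df` itself is never modified here, the original first column is still available afterwards.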

Answer 2

You can use pop + concat here for an alternative version of Wen's answer. Note that pop removes the column from df in place.

pd.concat([df, df.pop(df.columns[0]).str.get_dummies(sep=',')], axis=1)

   1  2  a  b  c
0  2  3  1  0  0
1  5  6  1  1  0
2  8  9  0  0  1
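
If mutating `df` is not acceptable, a rough non-mutating variant could look like the sketch below. It swaps pop + concat for `drop` + `join`, which is my own substitution and not what the answer proposes; the names `dummies` and `result` are illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame([['a', 2, 3], ['a,b', 5, 6], ['c', 8, 9]])

# Build the indicator columns from column 0 without removing it from df.
dummies = df[0].str.get_dummies(sep=',')

# Drop column 0 from a new frame and attach the indicators by index.
result = df.drop(columns=0).join(dummies)

print(result)
```

This leaves `df` with all three original columns, at the cost of one extra intermediate frame.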
