Split values in a column and create a matrix of column names

Question

I would like to have a solution for my problem with minimum effort.

Question:

I have a list of comma-delimited values. I would like to split them and place each value in the appropriate cell. The column headings should also be populated.

Input

A,B,C
C,D,A,E
D,E

Output

+-------+-------+-------+-------+-------+
| VLUE1 | VLUE2 | VLUE3 | VLUE4 | VLUE5 |
+-------+-------+-------+-------+-------+
| A     | B     | C     |       |       |
| A     |       | C     | D     | E     |
|       |       |       | D     | E     |
+-------+-------+-------+-------+-------+

I have a solution that uses sorting, key-value pairs, and iteration in Python, but I would like to know whether there is a shortcut using Python packages or pandas.
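The loop-based code itself is not shown in the question. As a rough baseline for comparison, a minimal sketch of such an approach might look like the following; the helper name split_to_rows and its parameters are hypothetical, not taken from the original post.

```python
def split_to_rows(lines, sep=","):
    """Hypothetical helper: return a header plus one row per input line."""
    # Collect every distinct token and sort to fix the column order.
    tokens = sorted({tok for line in lines for tok in line.split(sep)})
    header = [f"VLUE{i}" for i in range(1, len(tokens) + 1)]
    rows = []
    for line in lines:
        present = set(line.split(sep))
        # Keep the token in its own column, or leave the cell blank.
        rows.append([tok if tok in present else "" for tok in tokens])
    return header, rows


header, rows = split_to_rows(["A,B,C", "C,D,A,E", "D,E"])
print(header)
for row in rows:
    print(row)
```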

-Sam

Answer 1

Starting with a series -

s

0      A,B,C
1    C,D,A,E
2        D,E
dtype: object
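For reproducibility, a series like the one above could be built as follows (a minimal sketch; the answer does not show how s was created, so the constructor call is an assumption):

```python
import pandas as pd

# Sample input from the question: one comma-delimited string per row.
s = pd.Series(["A,B,C", "C,D,A,E", "D,E"])
print(s)
```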

Convert s to a one-hot encoded (OHE) matrix using str.get_dummies -

x = s.str.get_dummies(sep=',')
x

   A  B  C  D  E
0  1  1  1  0  0
1  1  0  1  1  1
2  0  0  0  1  1

Use this to create a new dataframe by multiplying each 0/1 indicator by its column name, which relies on string repetition (1 * 'A' is 'A', 0 * 'A' is '') -

v = x.mul(x.columns).values
c = np.arange(1, x.shape[1] + 1)

df = pd.DataFrame(v, columns=c).add_prefix('VLUE') 
df

  VLUE1 VLUE2 VLUE3 VLUE4 VLUE5
0     A     B     C            
1     A           C     D     E
2                       D     E
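The steps above can also be collected into one self-contained sketch. This variant swaps the string-multiplication step for np.where, in case multiplying integers by strings behaves differently across pandas versions. The helper name split_to_matrix and its parameters are hypothetical.

```python
import numpy as np
import pandas as pd


def split_to_matrix(s, sep=",", prefix="VLUE"):
    """Hypothetical helper: one column per distinct token, blank where absent."""
    # 0/1 indicator matrix with one column per distinct token.
    x = s.str.get_dummies(sep=sep)
    # Where the indicator is set, keep the column name; otherwise leave blank.
    values = np.where(x.to_numpy(dtype=bool), x.columns.to_numpy(), "")
    # Number the output columns from 1 and apply the prefix.
    cols = [f"{prefix}{i}" for i in range(1, x.shape[1] + 1)]
    return pd.DataFrame(values, index=s.index, columns=cols)


s = pd.Series(["A,B,C", "C,D,A,E", "D,E"])
df = split_to_matrix(s)
print(df)
```

Using np.where keeps the blank cells as empty strings, matching the output shown above.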

Answer 2

As far as I know, get_dummies is the fastest. Here's my attempt with value_counts and masking:

mask = df[0].str.split(',',expand=True).apply(pd.value_counts,1).notna()

pd.DataFrame(np.where(mask,mask.columns,'')).add_prefix('VALU')


  VALU0 VALU1 VALU2 VALU3 VALU4
0     A     B     C            
1     A           C     D     E
2                       D     E
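The snippet above assumes the strings sit in column 0 of an existing DataFrame, and it numbers the output columns from 0. A self-contained sketch of the same masking idea follows, with three assumptions of mine: the sample df is constructed explicitly, the columns are renamed to VLUE1..VLUE5 to match the question, and the top-level pd.value_counts (deprecated in recent pandas releases) is replaced by the Series method.

```python
import numpy as np
import pandas as pd

# Assumed setup: the delimited strings live in column 0 of a DataFrame.
df = pd.DataFrame(["A,B,C", "C,D,A,E", "D,E"])

# One token per cell, padded with missing values on shorter rows.
parts = df[0].str.split(",", expand=True)

# Count tokens row by row; tokens absent from a row come back as NaN.
counts = parts.apply(lambda row: row.value_counts(), axis=1)

# True where the token occurs in the row; sort columns for a fixed order.
mask = counts.notna().sort_index(axis=1)

out = pd.DataFrame(
    np.where(mask.to_numpy(), mask.columns.to_numpy(), ""),
    columns=[f"VLUE{i}" for i in range(1, mask.shape[1] + 1)],
)
print(out)
```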

2 answers

solution1
2 ACCEPTED 2017-12-24 14:27:54

solution2
2 2017-12-24 15:06:24
