How to convert a DataFrame with a variable number of columns per row?

Question

I have the following DataFrame:

ID      Code
5966856 A
5966856 B
5966857 A
5966854 A
5966854 B
5966854 C
6648070 A
6648074 A
6648075 B

I wish to convert it to:

ID      Code_1 Code_2 Code_3
5966856 A      B      NaN
5966857 A      NaN    NaN
5966854 A      B      C
6648070 A      NaN    NaN
6648074 A      NaN    NaN
6648075 B      NaN    NaN

I tried groupby and pivot, but in both cases I need to define the columns up front, and in my case their number varies. The number of Code columns should equal the largest number of codes for any single ID; the remaining cells should be filled with NaN.
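For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame and counts the codes per ID. The names codes_per_id and max_codes are only illustrative; they are not part of the original question.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": [5966856, 5966856, 5966857, 5966854, 5966854, 5966854,
           6648070, 6648074, 6648075],
    "Code": ["A", "B", "A", "A", "B", "C", "A", "A", "B"],
})

# Number of codes for each ID; the largest group determines
# how many Code_N columns the wide result needs.
codes_per_id = df.groupby("ID")["Code"].size()
max_codes = int(codes_per_id.max())
```

For this data the largest group is ID 5966854 with three codes, so the wide frame needs Code_1 through Code_3.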

Answer 1

Use:

first convert column Code to lists per group
then pass those lists to the DataFrame constructor
rename the columns with a custom function
call reset_index to turn the ID index back into a column

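import pandas as pd
# collect the codes of each ID into one list per group
a = df.groupby('ID')['Code'].apply(list)
# map the positional column labels 0, 1, 2 to Code_1, Code_2, Code_3
c = lambda x: 'Code_{}'.format(x+1)
df = pd.DataFrame(a.values.tolist(), index=a.index).rename(columns=c).reset_index()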
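The steps above can also be sketched as a self-contained script on the sample data from the question. One assumption that is not in the answer: sort=False is passed to groupby so the IDs keep their order of first appearance, as in the desired output. The variable names lists and wide are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [5966856, 5966856, 5966857, 5966854, 5966854, 5966854,
           6648070, 6648074, 6648075],
    "Code": ["A", "B", "A", "A", "B", "C", "A", "A", "B"],
})

# One list of codes per ID. sort=False (an addition to the answer's code)
# keeps the groups in order of first appearance instead of sorting by ID.
lists = df.groupby("ID", sort=False)["Code"].apply(list)

# The constructor pads shorter lists with missing values, so the widest
# group sets the number of columns.
wide = (
    pd.DataFrame(lists.tolist(), index=lists.index)
    .rename(columns=lambda i: f"Code_{i + 1}")  # 0, 1, 2 -> Code_1, Code_2, Code_3
    .reset_index()
)
print(wide)
```

Without sort=False the result is sorted by ID, which is what the answer's output shows.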

Alternative:

create a Series with cumcount as a per-ID counter, add 1, cast to string and prepend 'Code_' with radd
call set_index with the ID column and that Series
reshape with unstack
call reset_index to turn the ID index back into a column

a = df.groupby('ID')['Code'].cumcount().add(1).astype(str).radd('Code_')
df = df.set_index(['ID', a])['Code'].unstack().reset_index()

print(df)  # output shown is from the first approach; with unstack the missing cells show NaN instead of None
        ID Code_1 Code_2 Code_3
0  5966854      A      B      C
1  5966856      A      B   None
2  5966857      A   None   None
3  6648070      A   None   None
4  6648074      A   None   None
5  6648075      B   None   None
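A closely related variant, sketched here under my own assumptions rather than taken from the answer, uses the same cumcount counter but reshapes with pivot, then reindexes to restore the original ID order. The helper column name col and the variables pos and wide are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [5966856, 5966856, 5966857, 5966854, 5966854, 5966854,
           6648070, 6648074, 6648075],
    "Code": ["A", "B", "A", "A", "B", "C", "A", "A", "B"],
})

# Position of each code within its ID: 1, 2, 3, ...
pos = df.groupby("ID").cumcount() + 1

wide = (
    df.assign(col="Code_" + pos.astype(str))       # hypothetical helper column
    .pivot(index="ID", columns="col", values="Code")
    .reindex(df["ID"].unique())                     # order of first appearance
)
wide.columns.name = None   # drop the leftover "col" label on the columns
wide.index.name = "ID"     # make sure reset_index produces an "ID" column
wide = wide.reset_index()
print(wide)
```

Here the missing cells are NaN. Note that pivot sorts the column labels as strings, so with ten or more codes per ID, Code_10 would sort before Code_2; keeping the counter numeric until after the reshape would avoid that.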

solution1
3 ACCEPTED 2018-03-07 13:05:57
