Unusual reshaping of Pandas DataFrame

Question

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({'x': ['a', 'a', 'b', 'b', 'b', 'c'],
                   'y': [1, 2, 3, 4, 5, 6]})

which looks like:

   x  y
0  a  1
1  a  2
2  b  3
3  b  4
4  b  5
5  c  6

I need to reshape it so that the 'x' column contains only unique values:

   x    y_1  y_2  y_3
0  a    1    2    NaN
1  b    3    4    5
2  c    6    NaN  NaN

So the number of 'y_N' columns has to be equal to

max(df.groupby('x').count().values)

and the x column has to contain unique values.

So far I don't see how to build the y_N columns.

Thanks.

Answer 1

You can use pandas.crosstab, passing the per-group cumcount (the position of each row within its 'x' group) as the columns parameter:

# Each (x, position) cell holds exactly one value, so aggfunc just picks it.
(pd.crosstab(index=df.x, columns=df.groupby('x').cumcount() + 1, values=df.y,
             aggfunc=lambda s: s.iloc[0])
   .rename(columns="y_{}".format)
   .reset_index())
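
For comparison, here is a minimal sketch of the same idea using DataFrame.pivot instead of crosstab. This is an alternative, not the approach from the answer above, and the helper column name n is an assumption chosen only for illustration. It relies on the same trick: numbering the rows inside each 'x' group with cumcount and using that number as the column key.

```python
import pandas as pd

df = pd.DataFrame({'x': ['a', 'a', 'b', 'b', 'b', 'c'],
                   'y': [1, 2, 3, 4, 5, 6]})

# Position of each row within its 'x' group; cumcount starts at 0, so add 1.
# 'n' is a hypothetical helper column name used only for this sketch.
numbered = df.assign(n=df.groupby('x').cumcount() + 1)

# Spread the positions into columns; groups shorter than the longest get NaN.
wide = (numbered.pivot(index='x', columns='n', values='y')
                .rename(columns="y_{}".format)
                .reset_index())
wide.columns.name = None  # drop the leftover 'n' label on the column axis

# The number of y_N columns equals the size of the largest group.
n_cols = df.groupby('x').size().max()

print(wide)
```

Because the missing cells are filled with NaN, the y_N columns come out as floats rather than integers. If that matters, a nullable integer dtype could be applied afterwards.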