简体   繁体   中英

Dataframe - convert rows to columns - grouped by another columns

I am looking to convert a data frame as below

Original dataset

Group Miles
A 23
A 20
A 24
A 25
B 12
B 17
B 16
B 19

I want to convert from above format to this:

Col_A Col_B
23 12
20 17
24 16
25 19

TRY via pivot :

df = df.assign(t= df.groupby('Group').cumcount()).pivot(index = 't', columns ='Group', values = 'Miles').add_prefix('Col_').rename_axis(columns = None).reset_index(drop = True)

OR via pd.concat :

k = pd.concat([g.reset_index(drop=True)['Miles'] for _,g in df.groupby('Group')], 1)
k.columns = ['colA', 'colB']

One more option via set_index / unstack :

k = df.set_index(['Group', df.groupby('Group').cumcount()]).unstack(0).add_prefix('Col_').rename_axis(columns= [None,None])
k.columns = k.columns.droplevel()

One more via groupby / explode :

k = df.groupby('Group').agg(list).T.apply(pd.Series.explode).add_prefix('Col_')
k = k.reset_index(drop=True).rename_axis(columns = None)

OUTPUT:

   Col_A  Col_B
0     23     12
1     20     17
2     24     16
3     25     19

A pivot_table option:

df = (
    df.pivot_table(index=df.groupby('Group').cumcount(),
                   columns='Group',
                   values='Miles')
        .add_prefix('Col_')
        .rename_axis(columns=None)
)

df :

   Col_A  Col_B
0     23     12
1     20     17
2     24     16
3     25     19

Explaination:

Create a new index based on the relative position in each group with groupby cumcount :

df.groupby('Group').cumcount()
Group  new_index
    A          0
    A          1
    A          2
    A          3
    B          0
    B          1
    B          2
    B          3

Then Group can become the new columns in the wide format Frame.

df.pivot_table(index=df.groupby('Group').cumcount(),
                   columns='Group',
                   values='Miles')
Group   A   B
0      23  12
1      20  17
2      24  16
3      25  19

Then some cleanup with add_prefix + rename_axis :

df.pivot_table(index=df.groupby('Group').cumcount(),
               columns='Group',
               values='Miles')
    .add_prefix('Col_')
    .rename_axis(columns=None)
   Col_A  Col_B
0     23     12
1     20     17
2     24     16
3     25     19

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM