Dataframe - convert rows to columns - grouped by another column

I am looking to convert a data frame as shown below.

Original dataset:

Group  Miles
A      23
A      20
A      24
A      25
B      12
B      17
B      16
B      19

I want to convert from the above format to this:

Col_A  Col_B
23     12
20     17
24     16
25     19

Try via pivot:

df = df.assign(t= df.groupby('Group').cumcount()).pivot(index = 't', columns ='Group', values = 'Miles').add_prefix('Col_').rename_axis(columns = None).reset_index(drop = True)
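For anyone who wants to run the pivot approach end to end, here is a minimal sketch. The sample frame is rebuilt from the values in the question; the names `wide` and `t` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Group": ["A"] * 4 + ["B"] * 4,
    "Miles": [23, 20, 24, 25, 12, 17, 16, 19],
})

# Number the rows within each group (0, 1, 2, ...), then pivot on that counter
# so each group becomes its own column.
wide = (
    df.assign(t=df.groupby("Group").cumcount())
      .pivot(index="t", columns="Group", values="Miles")
      .add_prefix("Col_")
      .rename_axis(columns=None)
      .reset_index(drop=True)
)
print(wide)
```

The helper column `t` is what keeps `pivot` from failing: without it, the index would contain duplicate entries for each group.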

Or via pd.concat:

k = pd.concat([g.reset_index(drop=True)['Miles'] for _, g in df.groupby('Group')], axis=1)
k.columns = ['Col_A', 'Col_B']
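The concat approach above assigns the column names by hand, which only works when the groups are known in advance. A possible variant, sketched here on the question's sample data, passes a dict to `pd.concat` so that the keys become the column labels:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Group": ["A"] * 4 + ["B"] * 4,
    "Miles": [23, 20, 24, 25, 12, 17, 16, 19],
})

# One Series per group, re-indexed from 0 so the groups line up side by side.
# The dict keys are used as the resulting column names.
k = pd.concat(
    {
        f"Col_{name}": g["Miles"].reset_index(drop=True)
        for name, g in df.groupby("Group")
    },
    axis=1,
)
print(k)
```

This should scale to any number of groups without editing the column list.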

One more option via set_index / unstack:

k = df.set_index(['Group', df.groupby('Group').cumcount()]).unstack(0).add_prefix('Col_').rename_axis(columns= [None,None])
k.columns = k.columns.droplevel()

One more via groupby / explode:

k = df.groupby('Group').agg(list).T.apply(pd.Series.explode).add_prefix('Col_')
k = k.reset_index(drop=True).rename_axis(columns = None)

Output:

   Col_A  Col_B
0     23     12
1     20     17
2     24     16
3     25     19

A pivot_table option:

df = (
    df.pivot_table(index=df.groupby('Group').cumcount(),
                   columns='Group',
                   values='Miles')
    .add_prefix('Col_')
    .rename_axis(columns=None)
)

df:

   Col_A  Col_B
0     23     12
1     20     17
2     24     16
3     25     19

Explanation:

Create a new index based on the relative position of each row within its group, using groupby cumcount:

df.groupby('Group').cumcount()
Group  new_index
    A          0
    A          1
    A          2
    A          3
    B          0
    B          1
    B          2
    B          3
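To see what cumcount returns on its own, here is a small sketch. The first frame is the question's data; the second, `interleaved`, is a hypothetical reordering added only to show that the counter follows each group rather than the row order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Group": ["A"] * 4 + ["B"] * 4,
    "Miles": [23, 20, 24, 25, 12, 17, 16, 19],
})

# Position of each row within its own group.
pos = df.groupby("Group").cumcount()
print(pos.tolist())

# Hypothetical frame with the groups interleaved: the counter still restarts
# per group, independent of where the rows sit in the frame.
interleaved = pd.DataFrame({"Group": ["A", "B", "A", "B"], "Miles": [23, 12, 20, 17]})
pos_interleaved = interleaved.groupby("Group").cumcount()
print(pos_interleaved.tolist())
```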

Then the values of Group can become the new columns in the wide-format frame:

df.pivot_table(index=df.groupby('Group').cumcount(),
               columns='Group',
               values='Miles')
Group   A   B
0      23  12
1      20  17
2      24  16
3      25  19

Then some cleanup with add_prefix + rename_axis:

(
    df.pivot_table(index=df.groupby('Group').cumcount(),
                   columns='Group',
                   values='Miles')
    .add_prefix('Col_')
    .rename_axis(columns=None)
)
   Col_A  Col_B
0     23     12
1     20     17
2     24     16
3     25     19
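The question's groups happen to have the same size. As a rough sketch of what to expect otherwise, the hypothetical frame below gives group B one row fewer than group A; the shorter column is padded with NaN. Note also that pivot_table aggregates (mean by default), but since each position/group pair occurs only once here, the values pass through unchanged.

```python
import pandas as pd

# Hypothetical data: group A has three rows, group B only two.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B"],
    "Miles": [23, 20, 24, 12, 17],
})

wide = (
    df.pivot_table(index=df.groupby("Group").cumcount(),
                   columns="Group",
                   values="Miles")
    .add_prefix("Col_")
    .rename_axis(columns=None)
)
print(wide)

# The position that has no matching row in group B is missing (NaN).
print(wide["Col_B"].isna().tolist())
```

Because NaN is a float, a column with gaps will generally not keep an integer dtype.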
