
Grouping a DataFrame with groupby() in Python 3

import pandas as pd

df = pd.DataFrame({'order': ['A', 'B', 'C', 'D', 'E', 'F'], 'quantity': [1, 1, 2, 3, 3, 4]})

# Repeat each order label `quantity` times, then bucket the rows into groups of 4
df_out = df.order.repeat(df.quantity).reset_index(drop=True).to_frame()
df_out['grp'] = df_out.index // 4
df_out.groupby(['grp', 'order'])['order'].count().to_frame(name='quantity')

Output:

           quantity
grp order
0   A             1
    B             1
    C             2
1   D             3
    E             1
2   E             2
    F             2
3   F             2

With groupby() I get the result I want. But when I try to concat() it with df1,

df1 = pd.DataFrame({'order':['A', 'B', 'C', 'D', 'E', 'F'],'quantity':[1,1,2,3,3,4]})

I find that the 0 from grp is assigned to only the first row:

grp order
0 A 1

rather than to every row of the group, like this:

           quantity
grp order
0   A             1
0   B             1
0   C             2

How can I solve this problem?

What you get after groupby(*multiple_columns*).*some_action* is a DataFrame with a MultiIndex. You can reset it:

ans = (
    df_out
    .groupby(['grp', 'order'])['order']
    .count()
    .to_frame(name='quantity')
    .reset_index())
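As a side note, the 0 is not actually missing from the other rows of the grouped result. When pandas prints a MultiIndex it leaves repeated outer-level labels blank, but every row still carries its own grp value. The sketch below checks this, assuming the same df and df_out as in the question; the names grouped, grp_labels and flat are mine and only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'order': ['A', 'B', 'C', 'D', 'E', 'F'],
                   'quantity': [1, 1, 2, 3, 3, 4]})

df_out = df.order.repeat(df.quantity).reset_index(drop=True).to_frame()
df_out['grp'] = df_out.index // 4

# Grouped result with a (grp, order) MultiIndex, as in the question
grouped = (df_out.groupby(['grp', 'order'])['order']
           .count()
           .to_frame(name='quantity'))

# The printed frame hides repeated grp labels, but each row has one
grp_labels = grouped.index.get_level_values('grp').tolist()
print(grp_labels)

# reset_index() turns both index levels into ordinary columns
flat = grouped.reset_index()
print(flat.columns.tolist())
```

After reset_index(), grp and order are regular columns, so the label is repeated on every row both in the data and in the printed output.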

Then you can use any column as the index and drop that column:

ans.index = ans['grp']
ans = ans.drop('grp', axis=1)

ans is:

    order  quantity
grp                
0       A         1
0       B         1
0       C         2
1       D         3
1       E         1
2       E         2
2       F         2
3       F         2
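A possibly shorter route to the same frame is to reset only the order level, which leaves grp in place as the index. This is a sketch of an alternative, not the approach used above; it assumes the same df and df_out as in the question.

```python
import pandas as pd

df = pd.DataFrame({'order': ['A', 'B', 'C', 'D', 'E', 'F'],
                   'quantity': [1, 1, 2, 3, 3, 4]})

df_out = df.order.repeat(df.quantity).reset_index(drop=True).to_frame()
df_out['grp'] = df_out.index // 4

# Move only the 'order' level out of the index; 'grp' stays as the index
ans = (df_out.groupby(['grp', 'order'])['order']
       .count()
       .to_frame(name='quantity')
       .reset_index(level='order'))

print(ans)
```

Because grp is now a single-level index, its label is printed on every row, and no separate index assignment and drop are needed.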
