How to groupby multiple columns to list in pandas dataframe

I have a dataframe df:

    A   B   C        date
0   4   5   5   2019-06-2
1   3   5   2   2019-06-2
2   3   2   1   2019-06-2
3   4   4   3   2019-06-3
4   5   4   6   2019-06-3
5   2   3   7   2019-06-3

I can group by date and collect a single column into lists with the following code:

df.groupby('date')['A'].apply(list)


    A         date
0   [4,3,3]   2019-06-2
1   [4,5,2]   2019-06-3

But what if I want to collect multiple columns into lists? I've tried something like this, but it doesn't seem to work:

df.groupby('date')[['A','B','C']].apply(list)

The final dataframe should look like this:

    A         B         C         date
0   [4,3,3]   [5,5,2]   [5,2,1]   2019-06-2
1   [4,5,2]   [4,4,3]   [3,6,7]   2019-06-3

Use GroupBy.agg, which applies the function to each selected column separately:

df1 = df.groupby('date')[['A','B','C']].agg(list).reset_index()
print (df1)
        date          A          B          C
0  2019-06-2  [4, 3, 3]  [5, 5, 2]  [5, 2, 1]
1  2019-06-3  [4, 5, 2]  [4, 4, 3]  [3, 6, 7]
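
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data. The way df is constructed is an assumption (the question only shows the printed frame), and the dates are kept as plain strings. It also shows a likely reason the apply(list) attempt fails: calling list on a DataFrame iterates over its column labels, not its values.

```python
import pandas as pd

# Sample data rebuilt from the printed frame in the question (assumed construction).
df = pd.DataFrame({
    "A": [4, 3, 3, 4, 5, 2],
    "B": [5, 5, 2, 4, 4, 3],
    "C": [5, 2, 1, 3, 6, 7],
    "date": ["2019-06-2"] * 3 + ["2019-06-3"] * 3,
})

# list() on a DataFrame yields its column labels, which is what
# apply(list) receives for each multi-column group.
labels = list(df[["A", "B", "C"]])

# agg(list) is applied column by column, so each cell holds that
# column's values for the group.
df1 = df.groupby("date")[["A", "B", "C"]].agg(list).reset_index()
print(labels)
print(df1)
```
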

EDIT: If you want to apply several functions, pass them in a list:

df2 = df.groupby('date')[['A','B','C']].agg(['mean','min','max', list])
print (df2)
                  A                            B                            C  \
               mean min max       list      mean min max       list      mean   
date                                                                            
2019-06-2  3.333333   3   4  [4, 3, 3]  4.000000   2   5  [5, 5, 2]  2.666667   
2019-06-3  3.666667   2   5  [4, 5, 2]  3.666667   3   4  [4, 4, 3]  5.333333   


          min max       list  
date                          
2019-06-2   1   5  [5, 2, 1]  
2019-06-3   3   7  [3, 6, 7]  

This produces a MultiIndex in the columns, which you can flatten:

df2 = df.groupby('date')[['A','B','C']].agg(['mean','min','max', list])
df2.columns = df2.columns.map(lambda x: f'{x[0]}_{x[1]}')
df2 = df2.reset_index()
print (df2)
        date    A_mean  A_min  A_max     A_list    B_mean  B_min  B_max  \
0  2019-06-2  3.333333      3      4  [4, 3, 3]  4.000000      2      5   
1  2019-06-3  3.666667      2      5  [4, 5, 2]  3.666667      3      4   

      B_list    C_mean  C_min  C_max     C_list  
0  [5, 5, 2]  2.666667      1      5  [5, 2, 1]  
1  [4, 4, 3]  5.333333      3      7  [3, 6, 7]  
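
As a possible alternative to flattening afterwards, named aggregation can produce the flat column names directly. This is only a sketch: the df construction, the funcs mapping, and the named dictionary are illustrative names of mine, not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the printed frame in the question (assumed construction).
df = pd.DataFrame({
    "A": [4, 3, 3, 4, 5, 2],
    "B": [5, 5, 2, 4, 4, 3],
    "C": [5, 2, 1, 3, 6, 7],
    "date": ["2019-06-2"] * 3 + ["2019-06-3"] * 3,
})

# Suffix -> aggregation function (hypothetical helper mapping).
funcs = {"mean": "mean", "min": "min", "max": "max", "list": list}

# Build keyword arguments such as A_mean=("A", "mean") for every
# column/function pair; agg() treats each as output_name=(column, func).
named = {
    f"{col}_{suffix}": (col, func)
    for col in ["A", "B", "C"]
    for suffix, func in funcs.items()
}

df3 = df.groupby("date").agg(**named).reset_index()
print(df3)
```

The output columns follow the order of the keyword arguments, so the layout should match the flattened frame above without touching df.columns.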
