
Get value and key lists out of a pandas groupby

I am using pandas to build three arrays that I need for some statistics: the months, the number of finishes in each month, and the number of starts in each month.

My DataFrame is the following:

           month  finish  started
0  MONTH.Mar       1        0
1  MONTH.Mar       1        0
2  MONTH.Mar       1        0
3  MONTH.Mar       1        0
4  MONTH.Mar       1        0
5  MONTH.Mar       0        1
6  MONTH.Apr       1        0
7  MONTH.Mar       0        1
8  MONTH.Mar       0        1
9  MONTH.Feb       0        1

I do a groupby:

df.groupby('month').sum()

and the output is the following:

           finish  started
month                     
MONTH.Apr       1        0
MONTH.Feb       0        1
MONTH.Mar       5        3

How can I convert the data into three different lists like this:

['MONTH.Apr','MONTH.Feb','MONTH.Mar']
[1,0,5]
[0,1,3]

I tried frame.values.tolist() on the grouped result, but the output was the following:

[[1, 0], [0, 1], [5, 3]]

and the months are missing, because after the groupby they are in the index rather than in a column.

If I understand correctly, try reset_index() followed by a transpose (.T):

>>> df.groupby('month').sum().reset_index().T.to_numpy()
array([['MONTH.Apr', 'MONTH.Feb', 'MONTH.Mar'],
       [1, 0, 5],
       [0, 1, 3]], dtype=object)

Or:

>>> df.groupby('month').sum().reset_index().T.values.tolist()
[['MONTH.Apr', 'MONTH.Feb', 'MONTH.Mar'], [1, 0, 5], [0, 1, 3]]
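Note that transposing mixes strings and integers, which is why the array above has dtype=object. As a hedged alternative, the sketch below reads the group keys from the index and each column separately, so no transpose is needed. The DataFrame literal is an assumption that rebuilds the data shown in the question, and the variable names grouped, months, finish and started are illustrative only.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the table shown there).
df = pd.DataFrame({
    "month": ["MONTH.Mar"] * 6 + ["MONTH.Apr"] + ["MONTH.Mar"] * 2 + ["MONTH.Feb"],
    "finish": [1, 1, 1, 1, 1, 0, 1, 0, 0, 0],
    "started": [0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
})

grouped = df.groupby("month").sum()

# The group keys live in the index; the sums live in the columns.
months = grouped.index.tolist()
finish = grouped["finish"].tolist()
started = grouped["started"].tolist()

print(months)
print(finish)
print(started)
```

This keeps the integer columns as integers and does not depend on the order of the columns.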

You can use:

month, finish, started = df.groupby('month', as_index=False) \
                           .sum().to_dict('list').values()

Output:

>>> month
['MONTH.Apr', 'MONTH.Feb', 'MONTH.Mar']

>>> finish
[1, 0, 5]

>>> started
[0, 1, 3]
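The unpacking above works because to_dict('list') returns a dict keyed by column name, and its values come out in column order. A minimal sketch that looks the keys up by name instead, so it does not rely on that order, might look like this. The DataFrame literal is an assumption that rebuilds the question's data, and the name result is illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the table shown there).
df = pd.DataFrame({
    "month": ["MONTH.Mar"] * 6 + ["MONTH.Apr"] + ["MONTH.Mar"] * 2 + ["MONTH.Feb"],
    "finish": [1, 1, 1, 1, 1, 0, 1, 0, 0, 0],
    "started": [0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
})

# as_index=False keeps 'month' as a regular column, so it appears in the dict.
result = df.groupby("month", as_index=False).sum().to_dict("list")

# Pick each list by column name rather than by position.
month, finish, started = (result[key] for key in ("month", "finish", "started"))

print(result)
```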


 