
pandas groupby numbers for a new column

I have a DataFrame column "days" with 1000 rows of records.

If days is less than 7.0 (0-7), group it as "1-6 days".

If days is more than 7.1 but less than 14.0 (7.1-14.0), group it as "7-14 days".

If days is 15 or more, group it as "> 14 days".

How can I create a new column "Days_Group" to represent the days grouping?

Example days values:
1 3.0
2 4.6
3 14.9
4 7.1
5 15.1
6 109

np.searchsorted

import numpy as np

labels = np.array(['1-6 days', '7-14 days', '>14 days'])
bins = np.array([7, 14])

df.assign(Day_Group=labels[bins.searchsorted(df.days)])

    days  Day_Group
1    3.0   1-6 days
2    4.6   1-6 days
3   14.9   >14 days
4    7.1  7-14 days
5   15.1   >14 days
6  109.0   >14 days
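
To see where values sitting exactly on a bin edge end up, here is a minimal self-contained sketch. It uses the sample values from the question plus two boundary values (7.0 and 14.0) that are added purely for illustration. With the default side="left", a value equal to an edge falls into the lower group; side="right" moves it to the upper group.

```python
import numpy as np
import pandas as pd

# Sample values from the question, plus 7.0 and 14.0 (illustrative additions).
df = pd.DataFrame({"days": [3.0, 4.6, 14.9, 7.1, 15.1, 109.0, 7.0, 14.0]})

labels = np.array(["1-6 days", "7-14 days", ">14 days"])
bins = np.array([7, 14])
values = df["days"].to_numpy()

# side="left" (default): index i satisfies bins[i-1] < x <= bins[i]
left = labels[bins.searchsorted(values, side="left")]
# side="right": index i satisfies bins[i-1] <= x < bins[i]
right = labels[bins.searchsorted(values, side="right")]

result = df.assign(Days_Group=left, Days_Group_right=right)
print(result)
```

Which side to pick depends on whether exactly 7 days should count as "1-6 days" or "7-14 days"; the question leaves that open.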

Use pd.cut

df.assign(Day_Group=pd.cut(df['days'],
                           [0,7,14,np.inf],
                           labels=['1-6 days','7-14 days','> 14 days']))

Output:

    days  Day_Group
1    3.0   1-6 days
2    4.6   1-6 days
3   14.9  > 14 days
4    7.1  7-14 days
5   15.1  > 14 days
6  109.0  > 14 days
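
Once the group column exists, it can serve as a groupby key. The following is a rough sketch that assumes the six sample values from the question; the count and max aggregations are only illustrative choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"days": [3.0, 4.6, 14.9, 7.1, 15.1, 109.0]})
df["Days_Group"] = pd.cut(
    df["days"],
    bins=[0, 7, 14, np.inf],
    labels=["1-6 days", "7-14 days", "> 14 days"],
)

# pd.cut returns a categorical column; observed=False keeps every
# category in the result, even one that happens to have no rows.
summary = df.groupby("Days_Group", observed=False)["days"].agg(["count", "max"])
print(summary)
```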

I think you need pd.cut:

import numpy as np
import pandas as pd

df['Days_Group'] = pd.cut(df['days'],
                          bins=[0,7,14,np.inf], 
                          labels=['1-6 days','7-14 days','> 14 days'],
                          include_lowest=True)
print (df)
    days Days_Group
1    3.0   1-6 days
2    4.6   1-6 days
3   14.9  > 14 days
4    7.1  7-14 days
5   15.1  > 14 days
6  109.0  > 14 days

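Alternatively, float('inf') avoids the NumPy import (the old pd.np alias no longer exists in pandas 2.0 and later):

df['Days_Group'] = pd.cut(df['days'],
                          bins=[0,7,14, float('inf')],
                          labels=['1-6 days','7-14 days','> 14 days'],
                          include_lowest=True)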
print (df)
    days Days_Group
1    3.0   1-6 days
2    4.6   1-6 days
3   14.9  > 14 days
4    7.1  7-14 days
5   15.1  > 14 days
6  109.0  > 14 days
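
The question says "less than 7.0", while pd.cut closes each interval on the right by default, so 7.0 itself lands in "1-6 days". A small sketch comparing the default with right=False may help decide which behaviour is wanted. The boundary values 0.0, 7.0 and 14.0 are added here for illustration and are not part of the original data.

```python
import numpy as np
import pandas as pd

days = pd.Series([0.0, 3.0, 7.0, 7.1, 14.0, 14.9], name="days")
bins = [0, 7, 14, np.inf]
labels = ["1-6 days", "7-14 days", "> 14 days"]

# Default right=True: intervals are (0, 7], (7, 14], (14, inf];
# include_lowest=True additionally admits the left edge 0.
closed_right = pd.cut(days, bins=bins, labels=labels, include_lowest=True)

# right=False: intervals are [0, 7), [7, 14), [14, inf).
closed_left = pd.cut(days, bins=bins, labels=labels, right=False)

comparison = pd.DataFrame(
    {"days": days, "right=True": closed_right, "right=False": closed_left}
)
print(comparison)
```

Only the values sitting exactly on an edge (7.0 and 14.0) change group between the two variants.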

EDIT: If the days column holds timedeltas, convert it to float days first:

print (df)
               days
1   3 days 00:00:00
2   4 days 14:24:00
3  14 days 21:36:00
4   7 days 02:24:00
5  15 days 02:24:00
6 109 days 00:00:00

df['days'] = df['days'].dt.total_seconds() / 24 / 3600
print (df)
    days
1    3.0
2    4.6
3   14.9
4    7.1
5   15.1
6  109.0
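
As a self-contained sketch of the same conversion, the timedelta column can also be divided by a one-day pd.Timedelta, which should give the same floats as the total_seconds() route. The column name days_float is a hypothetical name used only in this example.

```python
import pandas as pd

df = pd.DataFrame({
    "days": pd.to_timedelta([
        "3 days", "4 days 14:24:00", "14 days 21:36:00",
        "7 days 02:24:00", "15 days 02:24:00", "109 days",
    ])
})

# Dividing a timedelta column by a one-day Timedelta yields float days.
df["days_float"] = df["days"] / pd.Timedelta(days=1)

# Same quantity computed via total seconds, for comparison.
via_seconds = df["days"].dt.total_seconds() / 24 / 3600
print(df)
```

The resulting float column can then be passed to pd.cut or np.searchsorted as in the answers above.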
