Python/Pandas Binning Data Timedelta

Question

I have a DataFrame with two columns:

    userID     duration
0   DSm7ysk    03:08:49
1   no51CdJ    00:35:50
2   ...

where 'duration' has type timedelta. I have tried using

bins = [dt.timedelta(minutes=0), dt.timedelta(minutes=5),
        dt.timedelta(minutes=10), dt.timedelta(minutes=20),
        dt.timedelta(minutes=30), dt.timedelta(hours=4)]

labels = ['0-5min','5-10min','10-20min','20-30min','30min+']

df['bins'] = pd.cut(df['duration'], bins, labels = labels)

However, the binned data doesn't use the specified bins; instead, a separate bin is created for each duration in the frame.

What is the simplest way to bin timedelta objects into irregular bins? Or am I just missing something obvious here?

Answer 1

This works for me with pandas 0.23.4:

import pandas as pd

df = pd.DataFrame({
    'userID': ['DSm7ysk', 'no51CdJ', 'foo', 'bar'],
    'duration': [
        pd.Timedelta('3 hours 8 minutes 49 seconds'),
        pd.Timedelta('35 minutes 50 seconds'),
        pd.Timedelta('1 minutes 13 seconds'),
        pd.Timedelta('6 minutes 43 seconds'),
    ],
})

bins = [
    pd.Timedelta(minutes = 0),
    pd.Timedelta(minutes = 5),
    pd.Timedelta(minutes = 10),
    pd.Timedelta(minutes = 20),
    pd.Timedelta(minutes = 30),
    pd.Timedelta(hours = 4)
]

labels = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

df['bins'] = pd.cut(df['duration'], bins, labels = labels)

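Result: each row is assigned to one of the five labelled bins. For the four sample rows, in order, these are '30min+', '30min+', '0-5min' and '5-10min'.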
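
If the same call behaves differently on your data, one thing worth checking is the dtype of the 'duration' column. This is only a guess, since the question does not show the dtype: a column holding Python timedelta objects with object dtype is not the same as a timedelta64 column. The sketch below uses a hypothetical frame built that way, converts it with pd.to_timedelta, and then bins it as above.

```python
import datetime as dt
import pandas as pd

# Hypothetical frame: 'duration' is deliberately stored with object dtype
# to mimic a column that was never converted to timedelta64.
df = pd.DataFrame({
    'userID': ['DSm7ysk', 'no51CdJ'],
    'duration': pd.Series(
        [dt.timedelta(hours=3, minutes=8, seconds=49),
         dt.timedelta(minutes=35, seconds=50)],
        dtype=object,
    ),
})
print(df['duration'].dtype)  # object

# Convert to a proper timedelta64 column before binning.
df['duration'] = pd.to_timedelta(df['duration'])
print(df['duration'].dtype)

bins = [pd.Timedelta(minutes=m) for m in (0, 5, 10, 20, 30)]
bins.append(pd.Timedelta(hours=4))
labels = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

df['bins'] = pd.cut(df['duration'], bins, labels=labels)
print(df)
```

Both sample durations exceed 30 minutes, so both rows should end up in '30min+'.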

Answer 2

You can normalize to seconds before binning. This reduces the problem to binning integers.

df = pd.DataFrame({'userID': ['A', 'B'],
                   'duration': pd.to_timedelta(['00:08:49', '00:35:50'])})

L = ['00:00:00', '00:05:00', '00:10:00', '00:20:00', '00:30:00', '04:00:00']

bins = pd.to_timedelta(L).total_seconds()
cats = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

df['bins'] = pd.cut(df['duration'].dt.total_seconds(), bins, labels=cats)

print(df)

#    duration userID     bins
# 0  00:08:49      A  5-10min
# 1  00:35:50      B   30min+
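
One caveat with a fixed upper edge of 4 hours: any longer duration falls outside every bin and becomes NaN, and with the default right-closed intervals a duration of exactly zero does too. A minimal sketch of one way around both, assuming the last label really is meant to be open-ended, is to use np.inf as the final edge and pass include_lowest=True. The sample rows here are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample rows, including a zero duration and one longer than 4 hours.
df = pd.DataFrame({
    'userID': ['A', 'B', 'C', 'D'],
    'duration': pd.to_timedelta(['00:00:00', '00:08:49', '00:35:50', '05:00:00']),
})

# Bin edges in seconds; np.inf leaves the last bin open-ended.
edges = [0, 5 * 60, 10 * 60, 20 * 60, 30 * 60, np.inf]
cats = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

# include_lowest=True closes the first interval on the left, so a duration
# of exactly zero lands in '0-5min' instead of NaN.
df['bins'] = pd.cut(df['duration'].dt.total_seconds(), edges,
                    labels=cats, include_lowest=True)

print(df)
```

With these rows the bins come out as '0-5min', '5-10min', '30min+' and '30min+'.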
