
How to group text in one row and calculate the time duration in python pandas?

I have a DataFrame like this:

ID    time       text
1   8:43:43 PM   one day
1   8:43:51 PM   this code
1   8:44:07 PM   will help
1   8:44:17 PM   someone.
2   8:45:56 AM   yes
2   8:46:09 AM   I'm feeling
2   8:46:25 AM   good.

I want to group by ID and, for each group, calculate the time duration in seconds (last time minus first time) and combine the text into one row. I know I can use join to concatenate the text within each ID group.

The expected output is:

ID   time-duration    text
1    34        one day this code will help someone.
2    29        yes I'm feeling good.

Use GroupBy.agg with named aggregation (available since pandas 0.25.0).

The advantage of named aggregation is that it aggregates and names the result column in one step; see time_duration in the output.

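import pandas as pd

# Parse the 12-hour clock strings; only the differences between times matter here
df['time'] = pd.to_datetime(df['time'], format='%I:%M:%S %p')

dfg = df.groupby('ID').agg(
    time_duration=('time', lambda x: x.max()-x.min()),  # last minus first timestamp
    text=('text', ' '.join)                              # concatenate text in row order
).reset_index()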
   ID time_duration                                  text
0   1      00:00:34  one day this code will help someone.
1   2      00:00:29                 yes I'm feeling good.
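
The result above holds the duration as a Timedelta, while the question asks for plain seconds (34 and 29). A minimal end-to-end sketch of one way to get there, assuming the sample data from the question and an explicit time format; the cast to int is my addition, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "time": ["8:43:43 PM", "8:43:51 PM", "8:44:07 PM", "8:44:17 PM",
             "8:45:56 AM", "8:46:09 AM", "8:46:25 AM"],
    "text": ["one day", "this code", "will help", "someone.",
             "yes", "I'm feeling", "good."],
})

# Assumption: times are 12-hour clock strings in this exact format
df["time"] = pd.to_datetime(df["time"], format="%I:%M:%S %p")

dfg = df.groupby("ID").agg(
    time_duration=("time", lambda x: x.max() - x.min()),
    text=("text", " ".join),
).reset_index()

# Convert the Timedelta column to whole seconds
dfg["time_duration"] = (
    pd.to_timedelta(dfg["time_duration"]).dt.total_seconds().astype(int)
)
print(dfg)
```

Since the times carry no date, this only works when all rows of a group fall on the same day.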

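Alternatively, use np.ptp ("peak to peak", i.e. maximum minus minimum) for the time column, once it has been converted with pd.to_datetime:

import numpy as np

df.groupby('ID').agg({'time':np.ptp,'text':' '.join})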
Out[49]:  
       time                                  text
ID                                               
1  00:00:34  one day this code will help someone.
2  00:00:29                 yes I'm feeling good.

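Group by ID and aggregate, converting the duration to seconds with total_seconds() (this assumes time has already been converted with pd.to_datetime):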

(df.groupby('ID', as_index=False)
   .agg({'time': lambda x: (x.max() - x.min()).total_seconds(),
         'text': ' '.join})
)

Output:

   ID  time                                  text
0   1  34.0  one day this code will help someone.
1   2  29.0                 yes I'm feeling good.
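
One thing to keep in mind: ' '.join concatenates the text in row order, so the approaches above rely on the rows already being in time order within each ID. If that is not guaranteed, sorting first is one option. A small sketch, using a hypothetical shuffled copy of the ID 2 rows from the question:

```python
import pandas as pd

# Hypothetical input: the ID 2 rows from the question, deliberately out of order
df = pd.DataFrame({
    "ID": [2, 2, 2],
    "time": ["8:46:25 AM", "8:45:56 AM", "8:46:09 AM"],
    "text": ["good.", "yes", "I'm feeling"],
})
df["time"] = pd.to_datetime(df["time"], format="%I:%M:%S %p")

out = (
    df.sort_values(["ID", "time"])      # chronological order within each ID
      .groupby("ID", as_index=False)    # groupby keeps row order inside groups
      .agg({"time": lambda x: (x.max() - x.min()).total_seconds(),
            "text": " ".join})
)
print(out)
```

The duration itself does not depend on row order, since it only uses the minimum and maximum; the sort matters for the joined text.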
