Find the time difference between groups in a DataFrame with Python
I'm using Python's pandas.
I have the following orders DataFrame, where each order has an order ID, an order time, and the IDs of the different items in the order. In this example I have three different groups: A, B, C:
order_id time item_id
0 A 2022-11-10 08:43:07 1
1 A 2022-11-10 08:43:07 2
2 A 2022-11-10 08:43:07 3
3 B 2022-11-10 08:46:27 1
4 B 2022-11-10 08:46:27 2
5 C 2022-11-10 08:58:45 3
I want to calculate the time difference between group A and group B, and then between group B and group C, in time order, and save the result into another column.
Wanted result:
order_id time item_id time_diff
0 A 2022-11-10 08:43:07 1
1 A 2022-11-10 08:43:07 2
2 A 2022-11-10 08:43:07 3
3 B 2022-11-10 08:46:27 1 0 days 00:03:20
4 B 2022-11-10 08:46:27 2 0 days 00:03:20
5 C 2022-11-10 08:58:45 3 0 days 00:12:18
How can I calculate the time difference between the groups when the time is the same for every row in a group?
I tried using .diff(), but I only got the difference inside each group:
df['time_diff'] = df.groupby('order_id')['time'].diff()
df
Out[141]:
order_id time item_id time_diff
0 A 2022-11-10 08:43:07 1 NaT
1 A 2022-11-10 08:43:07 2 0 days
2 A 2022-11-10 08:43:07 3 0 days
3 B 2022-11-10 08:46:27 1 NaT
4 B 2022-11-10 08:46:27 2 0 days
5 C 2022-11-10 08:58:45 3 NaT
I want the difference between the groups, not inside them. I can calculate the difference with .last().diff(), but I don't know how to save it back to the DataFrame as a column:
df.groupby('order_id')['time'].last().diff().to_frame('time_diff')
Out[]:
time_diff
order_id
A NaT
B 0 days 00:03:20
C 0 days 00:12:18
Thanks.
You were on the right track. Merge the per-group differences back onto the original DataFrame by order_id:
diff = df.groupby('order_id')['time'].last().diff().to_frame('time_diff').reset_index()
df = df.merge(diff, on='order_id', how='left')
df
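Note that the snippet above relies on groupby returning the groups sorted by order_id, which happens to match the time order in this example. As a minimal sketch of a variant, assuming the frame from the question (rebuilt by hand here) and that time is already a datetime column, you can sort the per-order timestamps explicitly and use Series.map instead of merge:

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "order_id": ["A", "A", "A", "B", "B", "C"],
    "time": pd.to_datetime([
        "2022-11-10 08:43:07", "2022-11-10 08:43:07", "2022-11-10 08:43:07",
        "2022-11-10 08:46:27", "2022-11-10 08:46:27",
        "2022-11-10 08:58:45",
    ]),
    "item_id": [1, 2, 3, 1, 2, 3],
})

# One timestamp per order, sorted by time so that diff() follows
# chronological order rather than the alphabetical order of order_id.
order_times = df.groupby("order_id")["time"].last().sort_values()

# Gap between each order and the one before it (NaT for the earliest order).
gaps = order_times.diff()

# Broadcast the per-order gap back to every row of that order.
df["time_diff"] = df["order_id"].map(gaps)
print(df)
```

Unlike merge, which returns a new frame with a fresh integer index, map assigns into the existing frame and keeps its index unchanged. Which one reads better is mostly a matter of taste.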