
How to calculate the difference between the number of rows for the most recent date and the second most recent date

I have the following df:

Index  Address  Date
0      0x06b    2021-12-02 16:03:09.332
1      0x04t    2021-12-03 16:03:09.332
2      0x12c    2021-12-03 16:03:09.332
3      0x3d5    2021-12-04 16:03:09.332
4      0x077    2021-12-04 16:03:09.332
5      0x998    2021-12-04 16:03:09.332

I want to calculate the difference in the number of rows (the len() of the column) between the most recent date (t), which in this case is 2021-12-04 16:03:09.332, and the previous date (t-1), and likewise for any earlier date (t-2, t-3, ..., t-n).

In this case, the answer for t - (t-1) should be 1, because the most recent date has 3 rows and the second most recent date has 2 rows: 3 - 2 = 1.

I have tried implementing the solution in this StackOverflow post, but it does not seem to work.

I take it you want to calculate the delta between the number of records on the latest available date and the number of records on each other day. Would the following achieve what you need?

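import pandas as pd

# Count rows per calendar day (assumes "Date" holds full timestamps), then subtract each count from the latest day's count
df2 = df.groupby(pd.to_datetime(df["Date"]).dt.date)[["Address"]].count().rename(columns={"Address": "count"})
df2.loc[df2.index.max(), "count"] - df2

OUTPUT

            count
Date
2021-12-02      2
2021-12-03      1
2021-12-04      0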
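
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question, assuming the columns are named Address and Date and that Date holds full timestamps. The names counts, vs_latest and day_over_day are only illustrative. It shows both readings of the question: every date compared against the latest date, and each date compared against the one just before it.

```python
import pandas as pd

# Sample data reconstructed from the question; column names are assumed.
df = pd.DataFrame({
    "Address": ["0x06b", "0x04t", "0x12c", "0x3d5", "0x077", "0x998"],
    "Date": pd.to_datetime([
        "2021-12-02 16:03:09.332",
        "2021-12-03 16:03:09.332",
        "2021-12-03 16:03:09.332",
        "2021-12-04 16:03:09.332",
        "2021-12-04 16:03:09.332",
        "2021-12-04 16:03:09.332",
    ]),
})

# Rows per calendar day, ordered from oldest to newest.
counts = df.groupby(df["Date"].dt.date).size().sort_index()

# Latest day's count minus each day's count: t - t, t - (t-1), t - (t-2), ...
vs_latest = counts.iloc[-1] - counts

# Each day's count minus the previous day's count (NaN for the first day).
day_over_day = counts.diff()

print(vs_latest)
print(day_over_day)
```

For the sample data, the last entry of day_over_day is the t - (t-1) value the question asks for, namely 1. If the Date column is stored as strings rather than datetimes, wrapping it in pd.to_datetime before grouping should give the same result.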
