
pandas - how to filter “most frequent” Datetime objects

I'm working with a DataFrame like the following:

User_ID    Datetime
01    2014-01-01 08:00:00
01    2014-01-02 09:00:00
02    2014-01-02 10:00:00
02    2014-01-03 11:00:00
03    2014-01-04 12:00:00
04    2014-01-04 13:00:00
05    2014-01-02 14:00:00

I would like to filter Users under certain conditions based on the Datetime column, e.g. keep only Users with one occurrence per month, or only Users whose occurrences all fall in summer, etc.

So far I've grouped the df with:

g = df.groupby(['User_ID','Datetime']).size()

obtaining the "traces" in time of each User:

User_ID    Datetime
01    2014-01-01 08:00:00
      2014-01-02 09:00:00
02    2014-01-02 10:00:00
      2014-01-03 11:00:00
03    2014-01-04 12:00:00
04    2014-01-04 13:00:00
05    2014-01-02 14:00:00

Then I applied a mask to filter, for instance, the Users with more than one trace:

mask = df.groupby('User_ID')['Datetime'].apply(lambda g: len(g)>1)
df = df[df['User_ID'].isin(mask[mask].index)]
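
For reference, a self-contained sketch of the mask approach above, run on the sample data from the question. The names kept, counts and kept_alt are illustrative only, and the transform-based variant is an alternative spelling of the same filter rather than part of the original post:

```python
import pandas as pd

# Sample data copied from the question; User_ID is kept as a string.
df = pd.DataFrame({
    "User_ID": ["01", "01", "02", "02", "03", "04", "05"],
    "Datetime": pd.to_datetime([
        "2014-01-01 08:00:00",
        "2014-01-02 09:00:00",
        "2014-01-02 10:00:00",
        "2014-01-03 11:00:00",
        "2014-01-04 12:00:00",
        "2014-01-04 13:00:00",
        "2014-01-02 14:00:00",
    ]),
})

# Boolean Series indexed by User_ID: True where the user has more than one row.
mask = df.groupby("User_ID")["Datetime"].apply(lambda g: len(g) > 1)
kept = df[df["User_ID"].isin(mask[mask].index)]

# Alternative (assumes no missing timestamps): broadcast each user's
# non-null row count back onto the rows and filter on it directly.
counts = df.groupby("User_ID")["Datetime"].transform("count")
kept_alt = df[counts > 1]
```

Both variants keep only the rows of users 01 and 02 on this sample.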

So this is fine. I'm looking for a function to replace the lambda g: len(g)>1 that can filter Users under different conditions, as I said before. In particular, I want to filter Users with one occurrence per month.

As long as the 'Datetime' column already has a datetime dtype and you are running pandas 0.15.0 or later (the version that introduced the .dt accessor), you can group by month in addition to the user ID and then filter the groups by checking their length. The example below keeps the user/month groups that contain more than one row; change the condition to match the rule you need:

In [29]:

df.groupby(['User_ID',df['Datetime'].dt.month]).filter(lambda x: len(x) > 1)
Out[29]:
   User_ID            Datetime
0        1 2014-01-01 08:00:00
1        1 2014-01-02 09:00:00
2        2 2014-01-02 10:00:00
3        2 2014-01-03 11:00:00
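
The same idea can be extended to the other conditions mentioned in the question. The following is a minimal sketch, not a definitive implementation: it assumes that "one occurrence per month" means every month in which a user appears has exactly one row, and that "summer" means June to August. The variable names (per_month, once_df, summer_df) are illustrative. It groups on dt.to_period("M") rather than dt.month so that the same month in different years is not merged.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "User_ID": ["01", "01", "02", "02", "03", "04", "05"],
    "Datetime": pd.to_datetime([
        "2014-01-01 08:00:00",
        "2014-01-02 09:00:00",
        "2014-01-02 10:00:00",
        "2014-01-03 11:00:00",
        "2014-01-04 12:00:00",
        "2014-01-04 13:00:00",
        "2014-01-02 14:00:00",
    ]),
})

# Row count per (user, calendar month); to_period("M") keeps the year.
month = df["Datetime"].dt.to_period("M")
per_month = df.groupby(["User_ID", month]).size()

# Assumption: keep users whose busiest month still has exactly one row.
one_per_month = per_month.groupby(level="User_ID").max() == 1
once_users = one_per_month[one_per_month].index
once_df = df[df["User_ID"].isin(once_users)]

# Assumption: "summer" is June-August; keep users with no rows outside it.
in_summer = df["Datetime"].dt.month.isin([6, 7, 8])
summer_only = in_summer.groupby(df["User_ID"]).all()
summer_df = df[df["User_ID"].isin(summer_only[summer_only].index)]
```

On the sample data, once_df contains users 03, 04 and 05, and summer_df is empty because every timestamp is in January.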


 