Pandas groupby date: select the earliest row per day

Question

I have the following dataset:

            value            timestamp
0            Fire  2017-10-03 14:33:52
1           Water  2017-10-04 14:33:48
2            Fire  2017-10-04 14:33:45
3            Fire  2017-10-05 14:33:30
4           Water  2017-10-03 14:33:40
5           Water  2017-10-05 14:32:13
6           Water  2017-10-04 14:32:01
7            Fire  2017-10-03 14:31:55

I want to group this dataset by day (based on timestamp) and then select only the earliest row for each day. For the above example, the result should be:

            value            timestamp
1           Water  2017-10-05 14:32:13
2           Water  2017-10-04 14:32:01
3            Fire  2017-10-03 14:31:55

For example, 2017-10-03 has 3 entries, but I only want the earliest one for that day.

Answer 1

If you have a unique index, you can use idxmin on timestamp to find the index label of the minimum timestamp within each day, and then extract those rows with loc:

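# Parse the strings into datetimes so the .dt accessor is available.
df.timestamp = pd.to_datetime(df.timestamp)
# idxmin returns, for each day, the index label of the earliest timestamp.
df.loc[df.groupby(df.timestamp.dt.date).timestamp.idxmin()]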

#   value             timestamp
#7   Fire   2017-10-03 14:31:55
#6  Water   2017-10-04 14:32:01
#5  Water   2017-10-05 14:32:13
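
For readers who want to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the idxmin approach. The names `earliest_idx` and `result` are illustrative only, and the sketch assumes the default integer index, which is unique.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "value": ["Fire", "Water", "Fire", "Fire", "Water", "Water", "Water", "Fire"],
    "timestamp": [
        "2017-10-03 14:33:52", "2017-10-04 14:33:48", "2017-10-04 14:33:45",
        "2017-10-05 14:33:30", "2017-10-03 14:33:40", "2017-10-05 14:32:13",
        "2017-10-04 14:32:01", "2017-10-03 14:31:55",
    ],
})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# One index label per calendar day: the row holding that day's minimum timestamp.
earliest_idx = df.groupby(df["timestamp"].dt.date)["timestamp"].idxmin()

# Look the labels up in the original frame to get the full rows.
result = df.loc[earliest_idx]
print(result)
```

Because loc looks rows up by label, this relies on the index being unique; with duplicate labels, loc would return every row sharing a selected label.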

Answer 2

First, make sure timestamp is a datetime column:

df.timestamp = pd.to_datetime(df.timestamp)

Solution

d1 = df.sort_values('timestamp')
d1[~d1.timestamp.dt.date.duplicated()]

   value           timestamp
7   Fire 2017-10-03 14:31:55
6  Water 2017-10-04 14:32:01
5  Water 2017-10-05 14:32:13
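
To make the sort-then-deduplicate step easier to follow, the sketch below splits it into its intermediate pieces on the question's data. The name `seen_before` is introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "value": ["Fire", "Water", "Fire", "Fire", "Water", "Water", "Water", "Fire"],
    "timestamp": pd.to_datetime([
        "2017-10-03 14:33:52", "2017-10-04 14:33:48", "2017-10-04 14:33:45",
        "2017-10-05 14:33:30", "2017-10-03 14:33:40", "2017-10-05 14:32:13",
        "2017-10-04 14:32:01", "2017-10-03 14:31:55",
    ]),
})

# After sorting, the first row seen for each day is that day's earliest row.
d1 = df.sort_values("timestamp")

# True for every row whose calendar day already appeared earlier in d1.
seen_before = d1["timestamp"].dt.date.duplicated()

# Keep only the first occurrence of each day.
result = d1[~seen_before]
print(result)
```

Unlike the idxmin approach, this selects rows with a boolean mask rather than by label, so it does not depend on the index being unique.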

Answer 3

Use dt.floor and head:

df.sort_values('timestamp').groupby(df['timestamp'].dt.floor('D')).head(1)

Output:

   value           timestamp
7   Fire 2017-10-03 14:31:55
6  Water 2017-10-04 14:32:01
5  Water 2017-10-05 14:32:13

Answer 4

Or, filtering each group down to the rows that equal the group minimum:

df.groupby(df.timestamp.dt.date).apply(lambda x:x[x.timestamp==min(x.timestamp)])
Out[714]: 
              value           timestamp
timestamp                              
2017-10-03 7   Fire 2017-10-03 14:31:55
2017-10-04 6  Water 2017-10-04 14:32:01
2017-10-05 5  Water 2017-10-05 14:32:13
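
One behavioural difference worth noting: comparing against the group minimum keeps every row tied for the earliest timestamp, whereas idxmin yields a single row per day. The sketch below illustrates this on hypothetical data (not the question's dataset) and swaps in groupby transform for apply, so the result keeps the original flat index instead of the extra date level shown above.

```python
import pandas as pd

# Hypothetical data: two rows share the earliest timestamp on 2017-10-03.
df = pd.DataFrame({
    "value": ["Fire", "Water", "Fire", "Water"],
    "timestamp": pd.to_datetime([
        "2017-10-03 14:31:55", "2017-10-03 14:31:55",
        "2017-10-03 14:33:40", "2017-10-04 14:32:01",
    ]),
})
day = df["timestamp"].dt.date

# Broadcast each day's minimum back onto its rows, then keep rows equal to it.
day_min = df.groupby(day)["timestamp"].transform("min")
with_ties = df[df["timestamp"] == day_min]

# idxmin, by contrast, returns one label per day (the first occurrence).
one_per_day = df.loc[df.groupby(day)["timestamp"].idxmin()]

print(with_ties)
print(one_per_day)
```

Which behaviour is preferable depends on whether tied rows should all be reported or collapsed to one.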
