Pandas: Groupby and cut within a group

Question

I have a pandas DataFrame that looks like this:

userid   name       date
1           name1    2016-06-04
1           name2    2016-06-05
1           name3    2016-06-04
1           name1    2016-06-06
2           name23   2016-06-01
2           name2    2016-06-01
3           name1    2016-06-03
3           name6    2016-06-03
3           name12   2016-06-03
3           name65   2016-06-04

For each user, I want to keep only the rows that fall on that user's first (earliest) date and drop the rest.

The final df would be as follows:

userid   name       date
1           name1    2016-06-04
1           name3    2016-06-04
2           name23   2016-06-01
2           name2    2016-06-01
3           name1    2016-06-03
3           name6    2016-06-03
3           name12   2016-06-03



The column dtypes are:

userid     int64
name      object
date      object

Each value in the date column is a datetime.date object.
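Because the column holds datetime.date objects, pandas reports it as object dtype. The following is a minimal sketch, assuming a small made-up frame with the same column names as above, of converting such a column with pd.to_datetime so that it sorts and compares as a proper datetime column:

```python
import datetime

import pandas as pd

# Hypothetical two-row sample that mirrors the question's columns.
df = pd.DataFrame({
    "userid": [1, 1],
    "name": ["name1", "name2"],
    "date": [datetime.date(2016, 6, 4), datetime.date(2016, 6, 5)],
})

# A column of datetime.date objects is stored with object dtype.
print(df["date"].dtype)

# Convert to a pandas datetime column; the dates themselves are unchanged.
df["date"] = pd.to_datetime(df["date"])
print(df["date"].dtype)
```

The conversion is optional for this task, since datetime.date values can already be compared and sorted, but a datetime column is usually easier to work with in later steps.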

So the task involves grouping by userid, sorting by date, and then retaining only the rows with the first (earliest) date.

Answer 1

You can first sort the DataFrame by the date column with sort_values, then use groupby with apply and boolean indexing to keep, for each user, all rows whose date equals the first (earliest) date in that group:

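# Parentheses are required so the method chain can span several lines.
df = (df.sort_values('date')
        .groupby('userid')
        # After sorting, iloc[0] is the earliest date in each group.
        .apply(lambda x: x[x.date == x.date.iloc[0]])
        .reset_index(drop=True))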

print(df)
   userid    name       date
0       1   name1 2016-06-04
1       1   name3 2016-06-04
2       2  name23 2016-06-01
3       2   name2 2016-06-01
4       3   name1 2016-06-03
5       3   name6 2016-06-03
6       3  name12 2016-06-03
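
As an alternative to apply (not the approach used in the answer above), the same filter can be sketched with groupby and transform('min'), which avoids sorting. The sample frame below is rebuilt from the question's data so that the snippet runs on its own. The names first_date and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "userid": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
    "name": ["name1", "name2", "name3", "name1", "name23", "name2",
             "name1", "name6", "name12", "name65"],
    "date": pd.to_datetime([
        "2016-06-04", "2016-06-05", "2016-06-04", "2016-06-06",
        "2016-06-01", "2016-06-01",
        "2016-06-03", "2016-06-03", "2016-06-03", "2016-06-04",
    ]),
})

# Earliest date per user, broadcast back onto every row of that user.
first_date = df.groupby("userid")["date"].transform("min")

# Keep only the rows that fall on their user's earliest date.
result = df[df["date"] == first_date].reset_index(drop=True)
print(result)
```

Because transform returns a Series aligned with the original index, the comparison is a plain boolean mask and the original row order is preserved.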

Answer 1
3, accepted, 2016-09-07 06:14:50
