Pandas group_by date and resample

Question

I have a data frame that looks like this:

    A   B   C   date
0   J   Y   2   2013-02-01 14:21:02.070030
1   X   X   0   2013-02-01 15:49:33.110849
2   Y   D   9   2013-02-01 06:47:19.369514
3   Y   C   17  2013-02-01 08:56:11.751781
4   3   J   21  2013-02-01 14:19:12.017232

I'd like to group by date and then count, but omit the information about the hours, minutes, seconds, etc.

It seems like something like this works:

df.set_index('date').resample('D').count()

Two questions:

Why does that work? Is that the right way?
Why doesn't something like df.group_by('date').resample('D').count() work?

Answer 1

resample is, in a sense, just a special case of groupby. Rather than grouping on distinct values, which is what groupby('date') would do, it groups on a time-based transformation of the index, which is why you need to set the index first. Alternatively, you could do:

df.groupby(pd.Grouper(key='date', freq='D')).count()

As of pandas 0.19.0, you can write the above like this:

df.resample('D', on='date').count()
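
To make the difference concrete, here is a minimal sketch that rebuilds the frame from the question and runs each variant side by side. The variable names (by_value, via_index, via_grouper, via_on) are only illustrative, and the date column is assumed to be a proper datetime column, not strings.

```python
import pandas as pd

# Rebuild the example frame from the question (assumes 'date' is datetime64).
df = pd.DataFrame({
    "A": ["J", "X", "Y", "Y", "3"],
    "B": ["Y", "X", "D", "C", "J"],
    "C": [2, 0, 9, 17, 21],
    "date": pd.to_datetime([
        "2013-02-01 14:21:02.070030",
        "2013-02-01 15:49:33.110849",
        "2013-02-01 06:47:19.369514",
        "2013-02-01 08:56:11.751781",
        "2013-02-01 14:19:12.017232",
    ]),
})

# Grouping on the raw column: each distinct timestamp forms its own group.
by_value = df.groupby("date").count()

# Three ways to bin the rows by calendar day instead.
via_index = df.set_index("date").resample("D").count()
via_grouper = df.groupby(pd.Grouper(key="date", freq="D")).count()
via_on = df.resample("D", on="date").count()  # needs pandas >= 0.19.0

print(len(by_value))      # number of groups when grouping on exact timestamps
print(via_index["C"])     # per-day counts via the index
print(via_grouper["C"])   # per-day counts via pd.Grouper
print(via_on["C"])        # per-day counts via resample(on=...)
```

Since all five timestamps fall on 2013-02-01, the three day-based variants each put every row into a single daily bin, while the plain groupby('date') keeps one group per timestamp.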

solution1
4 2016-09-09 00:36:04
