Pandas total count each day

I have a large dataset (df) with many columns, and I am trying to get the total number of rows for each day.

    |datetime|id|col3|col4|col...
1   |11-11-2020|7|col3|col4|col...
2   |10-11-2020|5|col3|col4|col...
3   |09-11-2020|5|col3|col4|col...
4   |10-11-2020|4|col3|col4|col...
5   |10-11-2020|4|col3|col4|col...
6   |07-11-2020|4|col3|col4|col...

I want my result to look something like this:

    |datetime|id|col3|col4|col...|Count
6   |07-11-2020|4|col3|col4|col...| 1
3              |5|col3|col4|col...| 1
2   |10-11-2020|5|col3|col4|col...| 1
4              |4|col3|col4|col...| 2
1   |11-11-2020|7|col3|col4|col...| 1

I tried to resample by day like this: df = df.groupby(['id','col3', pd.Grouper(key='datetime', freq='D')]).sum().reset_index(), and the result is shown below. I am still new to programming and pandas; I have read the pandas docs but am still unable to do it.

    |datetime|id|col3|col4|col...
6   |07-11-2020|4|col3|1|0.0
3   |07-11-2020|5|col3|1|0.0
2   |10-11-2020|5|col3|1|0.0
4   |10-11-2020|4|col3|2|0.0
1   |11-11-2020|7|col3|1|0.0

Try this:

df = df.groupby(['datetime','id','col3']).count()
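As a minimal sketch of how this behaves, the snippet below rebuilds a small frame shaped like the one in the question. The col3 and col4 values are made up for illustration, and the dates are assumed to be in day-month-year order. It also shows size(), which is not part of the line above but may be closer to the single Count column the question asks for.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; col3/col4 values are invented.
# Assumption: the dates are day-month-year.
df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["11-11-2020", "10-11-2020", "09-11-2020",
         "10-11-2020", "10-11-2020", "07-11-2020"],
        format="%d-%m-%Y",
    ),
    "id": [7, 5, 5, 4, 4, 4],
    "col3": ["x"] * 6,
    "col4": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# count() returns, for every remaining column, the number of non-null
# values in each (datetime, id, col3) group.
counts = df.groupby(["datetime", "id", "col3"]).count()

# size() returns one number per group (the row count), which can be
# turned into a single column named "Count".
sizes = (
    df.groupby(["datetime", "id", "col3"])
      .size()
      .reset_index(name="Count")
)

print(counts)
print(sizes)
```

With this sample data, the group for id 4 on 10-11-2020 has two rows and every other group has one.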

If you want the counts for all columns grouped only by date, then:

df.groupby('datetime').count()

You'll get a DataFrame that has the datetime values as the index, where each cell holds the number of non-null entries in that column for the given date.
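Because count() skips missing values, the per-column numbers can differ from the number of rows on a given day. A small sketch, using made-up data with one missing col4 value and assuming day-month-year dates, compares it with size(), which counts rows regardless of nulls:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one col4 value is missing on 10-11-2020.
df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["07-11-2020", "10-11-2020", "10-11-2020", "10-11-2020"],
        format="%d-%m-%Y",  # assumption: day-month-year
    ),
    "id": [4, 5, 4, 4],
    "col4": [1.0, np.nan, 2.0, 3.0],
})

# One row per date; each cell is the non-null count for that column.
per_day = df.groupby("datetime").count()

# One number per date: the total row count, nulls included.
rows_per_day = df.groupby("datetime").size()

print(per_day)
print(rows_per_day)
```

Here col4 reports 2 for 10-11-2020 while size() reports 3, so size() is likely the safer choice when the goal is simply the number of rows per day.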
