
Pandas - How to get the sum of rows by multiple columns in a DataFrame

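I have the following Pandas DataFrame object df, which represents events that occurred between 2000-07-01 and 2018-03-31. Each row represents one event that occurred on that particular date. FID_1 is the index column and uniquely identifies each event row. The ICC_NAME column contains 33 unique values for the locations where the events occurred.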

                comb_date       ICC_NAME
FID_1                                   
267   2000-09-18 09:49:00      Alexandra
462   2000-10-19 01:00:00      Alexandra
696   2000-11-26 15:08:00      Alexandra
734   2000-11-27 19:20:00      Alexandra
760   2000-11-28 20:00:00      Alexandra
761   2000-11-28 20:30:00      Alexandra
945   2000-05-12 12:37:00      Alexandra
1242  2000-12-12 14:35:00      Alexandra
1440  2000-12-16 06:45:00      Alexandra
1523  2000-12-17 12:55:00      Alexandra
1701  2000-12-19 18:40:00      Alexandra
1899  2000-12-26 11:42:00      Alexandra
1963  2000-12-29 09:43:00      Alexandra
1975  2000-12-29 15:54:00      Alexandra
2004  2000-12-30 13:26:00      Alexandra
2044  2000-12-31 13:18:00      Alexandra
2100  2001-01-01 00:06:00      Alexandra
2202  2001-02-01 13:34:00      Alexandra
2826  2001-11-01 13:32:00      Alexandra
2991  2001-01-15 10:55:00      Alexandra
3175  2001-01-20 11:18:00      Alexandra
3176  2001-01-20 11:35:00      Alexandra
3212  2001-01-20 22:55:00      Alexandra
3371  2001-01-26 14:25:00      Alexandra
3386  2001-01-26 19:05:00      Alexandra
3395  2001-01-27 13:20:00      Alexandra
3432  2001-01-28 18:03:00      Alexandra
3701  2001-06-02 18:29:00      Alexandra
3881  2001-02-14 10:00:00      Alexandra
4131  2001-02-21 17:48:00      Alexandra
...                   ...            ...
...                   ...            ...
...                   ...          Boort
...                   ...          Boort
...                   ...            ...
...                   ...            ...
96968 2018-01-25 17:27:00  Woori Yallock
96983 2018-01-25 19:04:00  Woori Yallock
96995 2018-01-26 00:03:00  Woori Yallock
97002 2018-01-26 09:39:00  Woori Yallock
97105 2018-01-28 11:12:00  Woori Yallock
97143 2018-01-29 14:42:00  Woori Yallock
97144 2018-01-29 15:00:00  Woori Yallock
97160 2018-01-30 21:54:00  Woori Yallock
97249 2018-06-02 22:40:00  Woori Yallock
97314 2018-11-02 12:38:00  Woori Yallock
97361 2018-02-13 16:49:00  Woori Yallock
97362 2018-02-13 16:55:00  Woori Yallock
97368 2018-02-14 05:48:00  Woori Yallock
97446 2018-02-18 11:17:00  Woori Yallock
97475 2018-02-19 18:52:00  Woori Yallock
97485 2018-02-20 15:42:00  Woori Yallock
97496 2018-02-20 22:19:00  Woori Yallock
97514 2018-02-22 14:47:00  Woori Yallock
97563 2018-02-25 20:37:00  Woori Yallock
97641 2018-02-28 17:19:00  Woori Yallock
97642 2018-02-28 17:45:00  Woori Yallock
97769 2018-07-03 07:35:00  Woori Yallock
97786 2018-07-03 22:05:00  Woori Yallock
97902 2018-11-03 16:20:00  Woori Yallock
97938 2018-12-03 14:33:00  Woori Yallock
97939 2018-12-03 14:35:00  Woori Yallock
97946 2018-12-03 20:23:00  Woori Yallock
98046 2018-03-17 18:24:00  Woori Yallock
98090 2018-03-18 11:06:00  Woori Yallock
98207 2018-03-22 19:58:00  Woori Yallock

[98372 rows x 2 columns]

What I want to achieve is the total number of events for each YYYY-MM and each ICC_NAME.

yyyy-mm      Alexandra      Boort      ...      Woori Yallock
2000-07             29         12      ...                  8
2000-08             20         16      ...                 13
... ...
... ...
2018-03             41         8       ...                 28

I was thinking of using resample, but I am not sure which column sum() should be applied to.

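Use crosstab, converting the datetimes to monthly periods with Series.dt.to_period. Then set the index name and remove the columns name with DataFrame.rename_axis, and turn the PeriodIndex into a regular column with DataFrame.reset_index: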

import pandas as pd

# Parse the timestamps, then count events per month (rows) and location (columns)
df['comb_date'] = pd.to_datetime(df['comb_date'])
df1 = (pd.crosstab(df['comb_date'].dt.to_period('M'), df['ICC_NAME'])
         .rename_axis(columns=None, index='yyyy-mm')
         .reset_index())
print(df1)
    yyyy-mm  Alexandra  Woori Yallock
0   2000-05          1              0
1   2000-09          1              0
2   2000-10          1              0
3   2000-11          4              0
4   2000-12          9              0
5   2001-01          9              0
6   2001-02          3              0
7   2001-06          1              0
8   2001-11          1              0
9   2018-01          0              8
10  2018-02          0             11
11  2018-03          0              3
12  2018-06          0              1
13  2018-07          0              2
14  2018-11          0              2
15  2018-12          0              3
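
Note that crosstab only produces rows for months that actually appear in the data, while the desired output in the question lists every month. The following is a minimal sketch of one way to get the same counts with groupby and then fill in the missing months with zeros. The small DataFrame is hypothetical: the Alexandra rows are copied from the question, but the Boort rows and their FID_1 values are made up for illustration, and the names month, counts and all_months are not from the original answer.

```python
import pandas as pd

# Hypothetical miniature of the question's data; the Boort rows are invented.
df = pd.DataFrame(
    {
        "comb_date": [
            "2000-09-18 09:49:00",
            "2000-11-26 15:08:00",
            "2000-11-27 19:20:00",
            "2000-11-28 20:00:00",
            "2000-11-05 08:00:00",
        ],
        "ICC_NAME": ["Alexandra", "Alexandra", "Alexandra", "Boort", "Boort"],
    },
    index=pd.Index([267, 696, 734, 760, 800], name="FID_1"),
)
df["comb_date"] = pd.to_datetime(df["comb_date"])
month = df["comb_date"].dt.to_period("M")

# Count rows per (month, location), then pivot the locations into columns.
counts = df.groupby([month, "ICC_NAME"]).size().unstack(fill_value=0)

# Add an all-zero row for every month in the range that has no events.
all_months = pd.period_range(month.min(), month.max(), freq="M")
counts = (counts.reindex(all_months, fill_value=0)
                .rename_axis(index="yyyy-mm", columns=None))
print(counts)
```

Because each row is one event, the operation needed here is a count of rows rather than a sum over a numeric column, which is why neither version applies sum() to any particular column.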

