Python Pandas - Produce a sum total of a column that either has a value '1' in it or NaN
I have a DataFrame that looks like this:
Opened Closed Resolved 07:01 - 09:00 09:01 - 11:00 11:01 - 13:00
2020-04-14 14:45:58 2020-04-14 15:04:22 0 days 00:18:24.000000000 1
2020-04-14 13:43:28 2020-04-14 14:12:22 0 days 00:28:54.000000000 1
2020-04-14 13:41:18 2020-04-14 14:12:28 0 days 00:31:10.000000000 1
2020-04-14 10:57:53 2020-04-14 11:24:58 0 days 00:27:05.000000000 1
2020-04-14 09:18:14 2020-04-14 09:44:04 0 days 00:25:50.000000000 1
2020-04-14 09:16:28 2020-04-14 09:31:12 0 days 00:14:44.000000000 1
2020-04-13 22:56:09 2020-04-14 00:39:30 0 days 01:43:21.000000000
2020-04-13 20:10:31 2020-04-13 20:26:25 0 days 00:15:54.000000000
2020-04-13 08:29:38 2020-04-13 18:29:25 0 days 09:59:47.000000000 1
2020-04-09 14:04:14 2020-04-09 15:31:01 0 days 01:26:47.000000000 1
2020-04-09 10:06:24 2020-04-09 10:33:39 0 days 00:27:15.000000000 1
2020-04-08 21:38:13 2020-04-09 07:01:30 0 days 09:23:17.000000000
2020-04-08 15:51:41 2020-04-08 16:08:02 0 days 00:16:21.000000000 1
2020-04-08 15:50:09 2020-04-08 16:07:57 0 days 00:17:48.000000000 1
2020-04-08 15:48:38 2020-04-08 16:07:52 0 days 00:19:14.000000000 1
I want to produce a sum of all the '1' values at the bottom of each column, so that it looks like this:
Opened Closed Resolved 07:01 - 09:00 09:01 - 11:00 11:01 - 13:00
2020-04-14 14:45:58 2020-04-14 15:04:22 0 days 00:18:24.000000000 1
2020-04-14 13:43:28 2020-04-14 14:12:22 0 days 00:28:54.000000000 1
2020-04-14 13:41:18 2020-04-14 14:12:28 0 days 00:31:10.000000000 1
2020-04-14 10:57:53 2020-04-14 11:24:58 0 days 00:27:05.000000000 1
2020-04-14 09:18:14 2020-04-14 09:44:04 0 days 00:25:50.000000000 1
2020-04-14 09:16:28 2020-04-14 09:31:12 0 days 00:14:44.000000000 1
2020-04-13 22:56:09 2020-04-14 00:39:30 0 days 01:43:21.000000000
2020-04-13 20:10:31 2020-04-13 20:26:25 0 days 00:15:54.000000000
2020-04-13 08:29:38 2020-04-13 18:29:25 0 days 09:59:47.000000000 1
2020-04-09 14:04:14 2020-04-09 15:31:01 0 days 01:26:47.000000000 1
2020-04-09 10:06:24 2020-04-09 10:33:39 0 days 00:27:15.000000000 1
2020-04-08 21:38:13 2020-04-09 07:01:30 0 days 09:23:17.000000000
2020-04-08 15:51:41 2020-04-08 16:08:02 0 days 00:16:21.000000000 1
2020-04-08 15:50:09 2020-04-08 16:07:57 0 days 00:17:48.000000000 1
2020-04-08 15:48:38 2020-04-08 16:07:52 0 days 00:19:14.000000000 1
Total 5 3 4
12
So each column gets its own total, and there is also one grand total across all the columns.
I tried:
data.groupby('Total')["07:01 - 09:00"].sum()[1]
But this outputs a long string of '1's: 11111111111
How do I actually get the total?
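Given the sample you provided, your three columns ('07:01 - 09:00', '09:01 - 11:00', '11:01 - 13:00') are probably of dtype str (object), which is why you get a long string of 1s: summing strings concatenates them instead of adding them. That said, you should convert the columns to float, as follows: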
data['07:01 - 09:00'] = data['07:01 - 09:00'].astype(float)
data['09:01 - 11:00'] = data['09:01 - 11:00'].astype(float)
data['11:01 - 13:00'] = data['11:01 - 13:00'].astype(float)
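As a small sketch of why the conversion matters, the snippet below uses made-up Series (not the asker's data) to compare the two behaviours. `pd.to_numeric(..., errors='coerce')` is shown as an alternative on the assumption that some cells might hold non-numeric text (the `'-'` placeholder here is hypothetical), which `astype(float)` would reject.

```python
import numpy as np
import pandas as pd

# Hypothetical column: '1' stored as text, missing cells as NaN
s = pd.Series(['1', np.nan, '1', '1'], dtype=object)

# With the NaN cells dropped, summing the strings concatenates them
as_text = s.dropna().sum()

# Converted to float, the sum is arithmetic and NaN is skipped by default
as_number = s.astype(float).sum()

# to_numeric with errors='coerce' turns unparseable cells into NaN
messy = pd.Series(['1', '-', '1'], dtype=object)
coerced = pd.to_numeric(messy, errors='coerce').sum()

print(repr(as_text), as_number, coerced)
```

If every cell is either the text '1' or NaN, `astype(float)` is enough; the coercing variant is only needed for messier input.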
Once the columns are numeric, you can append the column totals as a new row:
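import pandas as pd
# DataFrame.append was removed in pandas 2.0, so build the totals row and concatenate it instead
totals = data[['07:01 - 09:00', '09:01 - 11:00', '11:01 - 13:00']].sum()
data = pd.concat([data, totals.to_frame().T], ignore_index=True)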
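The question also asks for a single grand total, which the steps above do not cover. A minimal end-to-end sketch follows, assuming the flag columns hold the text '1' or NaN; the four-row frame and the `slots` and `grand_total` names are made up for illustration and are not the asker's data.

```python
import numpy as np
import pandas as pd

slots = ['07:01 - 09:00', '09:01 - 11:00', '11:01 - 13:00']

# Made-up sample: text '1' flags with NaN elsewhere, as the answer assumes
data = pd.DataFrame({
    'Opened': pd.to_datetime(['2020-04-14 08:45:58', '2020-04-14 09:18:14',
                              '2020-04-14 11:43:28', '2020-04-13 22:56:09']),
    '07:01 - 09:00': ['1', np.nan, np.nan, np.nan],
    '09:01 - 11:00': [np.nan, '1', np.nan, np.nan],
    '11:01 - 13:00': [np.nan, np.nan, '1', np.nan],
})

# Convert the flag columns to float so that sum() adds instead of concatenating
data[slots] = data[slots].astype(float)

# Per-column totals (NaN is skipped), then one grand total across the columns
totals = data[slots].sum()
grand_total = totals.sum()

# Append the totals as a final row; columns without a total are left empty
data = pd.concat([data, totals.to_frame().T], ignore_index=True)

print(data)
print('Grand total:', grand_total)
```

Keeping the grand total in a separate variable, rather than in the frame, avoids mixing a lone scalar into columns that otherwise hold one value per ticket.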