Calculation of values in columns and indexes in multiindex pandas pivot
My MultiIndex pandas pivot df looks like this:
Date       2019-10-01 11:00  2019-10-01 12:00  2019-10-01 13:00  ...  2019-10-29 17:00
ID                       25                24                25  ...                24
H_name
Hospital1                12                15                16  ...                12
Hospital2                10                17                14  ...                12
Hospital3                15                20                12  ...                12
I would like to get:
Date       2019-10-01  2019-10-02  2019-10-03
ID              25.45       24.33       23.71
H_name
Hospital1         253         287         261
Hospital2         212         232         264
Hospital3         221         219         223
The value for each 'H_name' row is the sum over all hours of the day, and the 'ID' is the average over all hours of the day. Thank you for your help =)
My df before the pivot:
        H_name              Date   ID  Value
0    Hospital1  2019-10-01 11:00   25     12
1    Hospital2  2019-10-01 11:00   25     10
2    Hospital3  2019-10-01 11:00   25     15
3    Hospital1  2019-10-01 12:00   24     15
4    Hospital2  2019-10-01 12:00   24     17
5    Hospital3  2019-10-01 12:00   24     20
          ....              ....  ...    ...
680  Hospital1  2019-10-30 15:00   20     11
681  Hospital2  2019-10-30 15:00   20     18
682  Hospital3  2019-10-30 15:00   20     17
If I understand you correctly, you want to group the data by date (summing Value and averaging ID), and then build the pivot table afterwards:
import numpy as np
import pandas as pd
h_name = ['Hospital1', 'Hospital2', 'Hospital3', 'Hospital1', 'Hospital2', 'Hospital3',
'Hospital1', 'Hospital2', 'Hospital3', 'Hospital1', 'Hospital2', 'Hospital3']
date = ['2019-10-01 11:00', '2019-10-01 11:00', '2019-10-01 11:00', '2019-10-01 12:00', '2019-10-01 12:00', '2019-10-01 12:00',
'2019-10-02 11:00', '2019-10-02 11:00', '2019-10-02 11:00', '2019-10-02 12:00', '2019-10-02 12:00', '2019-10-02 12:00']
ids = [25, 25, 25, 24, 24, 24,
23, 23, 23, 22, 22, 22]
value = [12, 10, 15, 15, 17, 20,
15, 16, 17, 14, 13, 22]
df = pd.DataFrame({'H_name': h_name, 'Date': date, 'ID': ids, 'Value': value})
df['Date'] = pd.to_datetime(df['Date'], utc=False)
print(df)
The data in df looks like:
       H_name                 Date  ID  Value
0   Hospital1  2019-10-01 11:00:00  25     12
1   Hospital2  2019-10-01 11:00:00  25     10
2   Hospital3  2019-10-01 11:00:00  25     15
3   Hospital1  2019-10-01 12:00:00  24     15
4   Hospital2  2019-10-01 12:00:00  24     17
5   Hospital3  2019-10-01 12:00:00  24     20
6   Hospital1  2019-10-02 11:00:00  23     15
7   Hospital2  2019-10-02 11:00:00  23     16
8   Hospital3  2019-10-02 11:00:00  23     17
9   Hospital1  2019-10-02 12:00:00  22     14
10  Hospital2  2019-10-02 12:00:00  22     13
11  Hospital3  2019-10-02 12:00:00  22     22
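As a quick sanity check on what the aggregation should produce, the daily figures for one hospital can be worked out by hand from the sample rows. This is only an illustrative sketch; the variable names are made up for the example and the numbers are Hospital1's two rows for 2019-10-01.

```python
# Hospital1's rows for 2019-10-01 in the sample data: (ID, Value) per hour.
hospital1_oct1 = [(25, 12), (24, 15)]

# Daily Value is the sum of the hourly values.
daily_value = sum(value for _, value in hospital1_oct1)

# Daily ID is the mean of the hourly IDs.
daily_id = sum(id_ for id_, _ in hospital1_oct1) / len(hospital1_oct1)

print(daily_value, daily_id)  # 27 24.5
```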
Then:
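# Daily bucket: drop the time of day from each timestamp
df['Date_1'] = df.Date.dt.date
# Per hospital and day: mean of ID, sum of Value
df = df.groupby(['H_name', 'Date_1'], as_index=False).agg({'ID': 'mean', 'Value': 'sum'})
print(df.pivot_table(index='H_name', columns=['Date_1', 'ID'], values='Value'))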
Prints:
Date_1     2019-10-01  2019-10-02
ID               24.5        22.5
H_name
Hospital1          27          29
Hospital2          27          29
Hospital3          35          39
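If only the already-pivoted frame is available (as in the question), the same daily roll-up can probably be done on the columns directly. The following is a minimal sketch under some assumptions: the column levels are named Date and ID, the Date level holds real timestamps rather than strings, and there is one ID per hourly column. The names pivot, days, daily_sum and daily_id are made up for the example, and the small frame is rebuilt by hand from the sample data above.

```python
import pandas as pd

# Hypothetical stand-in for the question's pivot:
# columns are a (Date, ID) MultiIndex, rows are hospitals.
cols = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2019-10-01 11:00"), 25),
        (pd.Timestamp("2019-10-01 12:00"), 24),
        (pd.Timestamp("2019-10-02 11:00"), 23),
        (pd.Timestamp("2019-10-02 12:00"), 22),
    ],
    names=["Date", "ID"],
)
pivot = pd.DataFrame(
    [[12, 15, 15, 14], [10, 17, 16, 13], [15, 20, 17, 22]],
    index=pd.Index(["Hospital1", "Hospital2", "Hospital3"], name="H_name"),
    columns=cols,
)

# Calendar day of every hourly column (time of day reset to midnight).
days = pivot.columns.get_level_values("Date").normalize()

# Sum the hourly values per day: transpose so the hourly columns become rows,
# group those rows by day, then transpose back.
daily_sum = pivot.T.groupby(days).sum().T

# Average the ID level per day (each hourly column counts once).
ids = pd.Series(pivot.columns.get_level_values("ID").to_numpy(), index=days)
daily_id = ids.groupby(level=0).mean()

# Rebuild two-level columns: (day, mean ID of that day).
daily_sum.columns = pd.MultiIndex.from_arrays(
    [daily_sum.columns, daily_id.reindex(daily_sum.columns).to_numpy()],
    names=["Date", "ID"],
)
print(daily_sum)
```

Because the ID mean is computed once per day from the column level, this variant does not depend on every hospital having the same ID values within a day.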