pandas pivot_table: how to add nested columns
df
   billsec disposition        Date Hour
0      185    ANSWERED  2016-11-01   00
1        0   NO ANSWER  2016-11-01   00
2       41    ANSWERED  2016-11-01   01
3        4    ANSWERED  2016-12-02   05
I have a table and need to build a summary table from it with the following layout:
The rows are hours of the day and the columns are days; under each day there are three sub-columns: total number of calls, missed calls, and total call duration.
How can I add these additional columns (All, Lost, Time) to the table? So far I have only managed to compute the total call duration per hour and the total number of calls, and only as two separate tables:
# rows=/cols= are the old keyword names; current pandas uses index=/columns=
df.pivot_table(index='Hour', columns='Date', aggfunc=len, fill_value=0)
df.pivot_table(index='Hour', columns='Date', aggfunc=sum, fill_value=0)
IIUC you can do it this way:
Assuming we have the following DataFrame:
In [248]: df
Out[248]:
             calldate  billsec disposition
0 2016-11-01 00:05:26      185    ANSWERED
1 2016-11-01 00:01:26        0   NO ANSWER
2 2016-11-01 00:05:19       41    ANSWERED
3 2016-11-01 00:16:02        4    ANSWERED
4 2016-11-02 01:16:02       55    ANSWERED
5 2016-11-02 02:02:02        2   NO ANSWER
we can do the following:
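# Note: this nested-dict form of .agg() (renaming via the inner dict keys)
# was deprecated in pandas 0.20 and removed in pandas 1.0.
funcs = {
    'billsec': {
        'all': 'size',    # total number of calls
        'time': 'sum'     # total call duration
    },
    'disposition': {
        'lost': lambda x: (x == 'NO ANSWER').sum()    # missed calls
    }
}
(df.assign(d=df.calldate.dt.strftime('%d.%m'), t=df.calldate.dt.hour)
   .groupby(['t','d'])[['billsec','disposition']].agg(funcs)
   .unstack('d', fill_value=0)
   .swaplevel(axis=1)
   .sort_index(level=[0,1], axis=1)
)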
yields:
d  01.11           02.11
     all time lost   all time lost
t
0      4  230    1     0    0    0
1      0    0    0     1   55    0
2      0    0    0     1    2    1
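On pandas 1.0 and later, renaming through a nested dict in `.agg()` raises a `SpecificationError`, so the same table can be built with named aggregation instead. What follows is a minimal sketch rather than the original answer's code: it rebuilds the sample frame by hand, assumes pandas 0.25 or newer (where named aggregation is available), and the variable name `result` is only illustrative.

```python
import pandas as pd

# Sample data copied from the DataFrame shown in the answer.
df = pd.DataFrame({
    'calldate': pd.to_datetime([
        '2016-11-01 00:05:26',
        '2016-11-01 00:01:26',
        '2016-11-01 00:05:19',
        '2016-11-01 00:16:02',
        '2016-11-02 01:16:02',
        '2016-11-02 02:02:02',
    ]),
    'billsec': [185, 0, 41, 4, 55, 2],
    'disposition': ['ANSWERED', 'NO ANSWER', 'ANSWERED',
                    'ANSWERED', 'ANSWERED', 'NO ANSWER'],
})

result = (
    # d = day label such as '01.11', t = hour of the day
    df.assign(d=df['calldate'].dt.strftime('%d.%m'),
              t=df['calldate'].dt.hour)
      .groupby(['t', 'd'])
      # Named aggregation: output_name=(source_column, function)
      .agg(
          all=('billsec', 'size'),     # total number of calls
          time=('billsec', 'sum'),     # total call duration
          lost=('disposition', lambda s: (s == 'NO ANSWER').sum()),  # missed calls
      )
      .unstack('d', fill_value=0)      # days become a column level
      .swaplevel(axis=1)               # put the day above all/time/lost
      .sort_index(axis=1)              # group the sub-columns under each day
)

print(result)
```

Because `sort_index(axis=1)` sorts both column levels, the sub-columns under each day come out in alphabetical order (all, lost, time). Individual cells can be read by label, for example `result.loc[0, ('01.11', 'time')]`.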