pandas aggregate multiple columns during pivot_table
I have a DataFrame like this:
import numpy as np
import pandas as pd

df = pd.DataFrame((['1990-01-01','A','S1','2','string2','string3'],
['1990-01-01','A','S2','1','string1','string4'],
['1990-01-01','A','S3','1','string5','string6']),columns=
["date","type","status","count","s1","s2"])
date type status count s1 s2
0 1990-01-01 A S1 2 string2 string3
1 1990-01-01 A S2 1 string1 string4
2 1990-01-01 A S3 1 string5 string6
...
I want to get the result below: each date and type combination should have a single row, with the minimum of the s1 column and the maximum of the s2 column.
date type S1 S2 S3 min_s1 max_s2
1990-01-01 A 2 1 1 string1 string6
I tried to use pivot_table:
df.pivot_table(index=['date','type'],columns=['status'],values=['count','s1','s2'], aggfunc={
'count':np.sum,
's1': np.min,
's2': np.max
})
But this only gives the result below, where s1 and s2 are also spread across one column per status instead of being reduced to a single min and max column.
                count          s1                         s2
status             S1 S2 S3    S1       S2       S3       S1       S2       S3
date       type
1990-01-01 A        2  1  1    string2  string1  string5  string3  string4  string6
Any ideas? Thanks.
Looks like you want to combine a pivot and groupby.agg:
(df.pivot(index=['date','type'],columns='status', values='count')
.join(df.groupby(['date', 'type']).agg({'s1': 'min', 's2': 'max'}))
.reset_index()
)
Output:
date type S1 S2 S3 s1 s2
0 1990-01-01 A 2 1 1 string1 string6
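If you also want the column names from the question (min_s1, max_s2), a self-contained variant might look like the sketch below. It rests on a few assumptions that are not in the original post: count is converted to a number with pd.to_numeric, pivot_table with aggfunc='sum' is swapped in for pivot so that repeated (date, type, status) rows would be summed rather than raising an error, and named aggregation is used to label the min and max columns. The variable names keys, counts, extremes and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['1990-01-01', 'A', 'S1', '2', 'string2', 'string3'],
     ['1990-01-01', 'A', 'S2', '1', 'string1', 'string4'],
     ['1990-01-01', 'A', 'S3', '1', 'string5', 'string6']],
    columns=["date", "type", "status", "count", "s1", "s2"],
)

# Assumption: count is meant to be numeric (it is stored as strings in the sample).
df['count'] = pd.to_numeric(df['count'])

keys = ['date', 'type']

# One column per status; duplicates of (date, type, status) are summed.
counts = df.pivot_table(index=keys, columns='status', values='count', aggfunc='sum')

# Named aggregation: output column name = (source column, aggregation).
extremes = df.groupby(keys).agg(min_s1=('s1', 'min'), max_s2=('s2', 'max'))

# Both frames are indexed by (date, type), so join aligns them row by row.
result = counts.join(extremes).reset_index()
print(result)
```

Note that min and max on s1 and s2 compare the strings lexicographically, which is what the expected result in the question implies.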