pandas aggregate multiple columns during pivot_table

I have a DataFrame like this:

import numpy as np
import pandas as pd

df = pd.DataFrame((['1990-01-01','A','S1','2','string2','string3'],
 ['1990-01-01','A','S2','1','string1','string4'],
 ['1990-01-01','A','S3','1','string5','string6']),
 columns=["date","type","status","count","s1","s2"])


           date type status count       s1       s2
 0  1990-01-01    A     S1     2  string2  string3
 1  1990-01-01    A     S2     1  string1  string4
 2  1990-01-01    A     S3     1  string5  string6
 ...

I want to get the result below: each date and type combination should have a single row, with the minimum of the s1 column and the maximum of the s2 column.

date             type       S1    S2   S3    min_s1        max_s2
1990-01-01       A           2     1   1     string1      string6

I tried to use pivot_table:

df.pivot_table(index=['date','type'],columns=['status'],values=['count','s1','s2'], aggfunc={
'count':np.sum, 
's1': np.min,
's2': np.max
})

But this only produces the result below, which has a separate column per status under every value and is not the final result I want.

                count             s1                         s2
status             S1 S2 S3       S1       S2       S3       S1       S2       S3
date       type
1990-01-01 A        2  1  1  string2  string1  string5  string3  string4  string6

Any ideas? Thanks.

It looks like you want to combine a pivot with groupby.agg:

(df.pivot(index=['date','type'],columns='status', values='count')
   .join(df.groupby(['date', 'type']).agg({'s1': 'min', 's2': 'max'}))
   .reset_index()
)

Output:

         date type S1 S2 S3       s1       s2
0  1990-01-01    A  2  1  1  string1  string6
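
If you also want the min_s1 / max_s2 column names from the question, a variation along the following lines may work. This is only a sketch under a few assumptions that are not in the original post: the count column is converted to numbers with pd.to_numeric, pivot_table with aggfunc="sum" is swapped in for pivot so that repeated (date, type, status) rows would be summed instead of raising an error, and the variable names counts, extremes, and result are purely illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["1990-01-01", "A", "S1", "2", "string2", "string3"],
        ["1990-01-01", "A", "S2", "1", "string1", "string4"],
        ["1990-01-01", "A", "S3", "1", "string5", "string6"],
    ],
    columns=["date", "type", "status", "count", "s1", "s2"],
)

# Assumption: the counts are meant to be numeric (they are strings above).
df["count"] = pd.to_numeric(df["count"])

# One column per status. pivot_table sums repeated
# (date, type, status) combinations, where pivot would raise.
counts = df.pivot_table(
    index=["date", "type"],
    columns="status",
    values="count",
    aggfunc="sum",
    fill_value=0,
)

# Named aggregation sets the final column names directly.
extremes = df.groupby(["date", "type"]).agg(
    min_s1=("s1", "min"),
    max_s2=("s2", "max"),
)

# Both frames are indexed by (date, type), so join aligns them row by row.
result = counts.join(extremes).reset_index()
result.columns.name = None  # drop any leftover "status" label on the columns

print(result)
```

The idea is the same as above: the counts are reshaped on their own, the string columns are aggregated per (date, type) group, and the two pieces are joined on their shared index.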
