
Simplify code for pandas groupby().agg()

I have a DataFrame that I'm trying to group and then sum over multiple columns, for which I have the code below:

df=df.groupby(['year','month']).agg({'A':['sum'],'B':['sum'],'C':['sum'],'D':['sum']})

Is there a way to build the argument to agg() by iterating through a list? I'm trying something like this, but it doesn't work.

col=['A','B','C','D']
df=df.groupby(['year','month']).agg({c for c in col})

Thank you very much!

You are very close. Note that you are passing agg() a set, not a dictionary. A dictionary comprehension needs key: value pairs, whereas yours has only a value.

df=df.groupby(['year','month']).agg({c: ['sum'] for c in col})

Because:

>>> {c: ['sum'] for c in col}
{'A': ['sum'], 'B': ['sum'], 'C': ['sum'], 'D': ['sum']}

In contrast to what you wrote:

>>> {c for c in col}  # no key: value pairs, so this builds a set (element order may vary)
{'A', 'B', 'C', 'D'}
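To check the dictionary version end to end, here is a minimal sketch on a made-up DataFrame. Only the column names come from the question; the values and the name `agg_spec` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "year":  [2020, 2020, 2020, 2021],
    "month": [1, 1, 2, 1],
    "A": [1, 2, 3, 4],
    "B": [10, 20, 30, 40],
    "C": [100, 200, 300, 400],
    "D": [1000, 2000, 3000, 4000],
})

col = ["A", "B", "C", "D"]

# Dict comprehension: each column name maps to a list of aggregations.
agg_spec = {c: ["sum"] for c in col}

result = df.groupby(["year", "month"]).agg(agg_spec)
print(result)
```

Because each value in the dictionary is a list, the result has two-level column labels such as ('A', 'sum'). Mapping each column to the plain string 'sum' instead gives flat column names.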

Edit: I'm assuming you only want to sum columns 'A' to 'D', not every column. If you do in fact want to sum all of them, as suggested in the comments under your question, you can just do:

df.groupby(['year','month']).sum()

Or

df.groupby(['year','month']).agg('sum')
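If only some columns should be summed, another option is to select them on the grouped object before calling sum(), which avoids the dictionary entirely. A minimal sketch on made-up data follows; the extra column `E` and the names `subset_sum` and `all_sum` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "year":  [2020, 2020, 2021],
    "month": [1, 1, 1],
    "A": [1, 2, 3],
    "B": [10, 20, 30],
    "E": [5, 5, 5],  # hypothetical column we do not want to sum
})

col = ["A", "B"]

# Select the columns of interest on the GroupBy object, then sum.
subset_sum = df.groupby(["year", "month"])[col].sum()

# Summing without a selection includes every non-key column, E as well.
all_sum = df.groupby(["year", "month"]).sum()

print(subset_sum)
print(all_sum)
```

Both results have flat column names, unlike the list-valued agg() dictionary, which produces two-level column labels.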
