Applying different aggregate functions to different columns (now that dict with renaming is deprecated)
I asked this question before (python pandas: applying different aggregate functions to different columns), but the latest changes to pandas (https://github.com/pandas-dev/pandas/pull/15931) mean that what I thought was an elegant and pythonic solution is deprecated, for reasons I genuinely fail to understand.
The question was, and still is: when doing a groupby, how can I apply different aggregate functions to different fields (e.g. sum of x, avg of x, min of y, max of z, etc.) and rename the resulting fields, all in one go, or at least in a reasonably pythonic and not-too-cumbersome way? I.e. sum_x won't do; I need to rename the fields explicitly.
This approach, which I liked:
df.groupby('qtr').agg({"realgdp": {"mean_gdp": "mean", "std_gdp": "std"},
                       "unemp": {"mean_unemp": "mean"}})
is deprecated and now produces this warning:
FutureWarning: using a dict with renaming is deprecated and will be removed in a future version
Thanks!
agg() itself is not deprecated; only renaming through a dict passed to agg() is.
Do go through the documentation: https://pandas.pydata.org/pandas-docs/stable/whatsnew.html#deprecate-groupby-agg-with-a-dictionary-when-renaming
What is deprecated:
1. Passing a dict to a grouped/rolled/resampled Series that allowed one to rename the resulting aggregation.
2. Passing a dict-of-dicts to a grouped/rolled/resampled DataFrame.
This will work, though it's not a single line of code:
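# Keep the aggregated result; agg() returns a new DataFrame and leaves df unchanged
result = df.groupby('qtr').agg({"realgdp": ["mean", "std"], "unemp": "mean"})
# Flatten the two-level column index, e.g. ('realgdp', 'mean') -> 'realgdp_mean'
result.columns = result.columns.map('_'.join)
result = result.rename(columns={'realgdp_mean': 'mean_gdp', 'realgdp_std': 'std_gdp', 'unemp_mean': 'mean_unemp'})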
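If you are on pandas 0.25 or later, named aggregation may be a more direct way to aggregate and rename in one call, without the flatten-and-rename steps above. The following is only a minimal sketch: the sample DataFrame is made up for illustration, and the column names are taken from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame in the question.
df = pd.DataFrame({
    "qtr": [1, 1, 2, 2],
    "realgdp": [10.0, 20.0, 30.0, 50.0],
    "unemp": [5.0, 7.0, 4.0, 6.0],
})

# Named aggregation (assumes pandas >= 0.25):
# each keyword is an output column name, each value is (input column, function).
summary = df.groupby("qtr").agg(
    mean_gdp=("realgdp", "mean"),
    std_gdp=("realgdp", "std"),
    mean_unemp=("unemp", "mean"),
)

print(summary)
```

The output columns come out already named mean_gdp, std_gdp and mean_unemp, so no separate rename step is needed.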