pandas groupby and apply function on multiple columns
If I have a function f that I apply more than once to a set of columns, what is a more Pythonic way of going about it? Right now, this is what I am doing:
newdf=df.groupby(['a', 'b']).apply(lambda x: f(x, 1))
newdf.columns=['1']
newdf['2']=df.groupby(['a', 'b']).apply(lambda x: f(x, 2))
newdf['3']=df.groupby(['a', 'b']).apply(lambda x: f(x, 3))
newdf['4']=df.groupby(['a', 'b']).apply(lambda x: f(x, 4))
Is there a better way of going about it?
Thanks,
That's Pythonic enough for me:
columns_dict = dict()
for i in range(1, 5):
    columns_dict[str(i)] = df.groupby(["a", "b"]).apply(lambda x: f(x, i))
pd.DataFrame(columns_dict)
You could do:
pandas.DataFrame([df.groupby(['a','b']).apply(lambda x : f(x,i)) for i in range(1,5)])
Each call to f ends up as one row of the new DataFrame, so transpose it if you want one column per call, as in the original newdf.
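As a variation on the list-based approach, here is a minimal sketch that uses pd.concat with axis=1 instead of building a DataFrame and transposing it. This is a swapped-in alternative, not what the answer above does. The toy df, the value column "x", and the body of f are all assumptions made for illustration; the question never shows them.

```python
import pandas as pd

# Hypothetical data: "a" and "b" are the grouping keys, "x" is a value column.
df = pd.DataFrame({
    "a": ["p", "p", "q", "q"],
    "b": ["u", "u", "v", "w"],
    "x": [1, 2, 3, 4],
})

def f(group, n):
    # Hypothetical stand-in for the question's f: scale "x" by n and sum it.
    return (group["x"] * n).sum()

# Selecting [["x"]] keeps the grouping keys out of the frames passed to f.
grouped = df.groupby(["a", "b"])[["x"]]

# One Series per argument value, joined side by side; keys become column names.
pieces = [grouped.apply(lambda g: f(g, i)) for i in range(1, 5)]
wide = pd.concat(pieces, axis=1, keys=[str(i) for i in range(1, 5)])
```

Because the result is assembled column by column, no transpose is needed, and the column labels are set in the same call.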
Use agg() to compute multiple values from a single groupby():
# Each entry is a (name, function) tuple; agg() calls the function on each column of each group.
df.groupby(['a', 'b']).agg([
    ('1', lambda x: f(x, 1)),
    ('2', lambda x: f(x, 2)),
    ('3', lambda x: f(x, 3)),
    ('4', lambda x: f(x, 4)),
])
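Or equivalently, binding i as a default argument so that each lambda keeps its own value instead of all of them seeing the final i:
df.groupby(['a', 'b']).agg([(str(i), lambda x, i=i: f(x, i)) for i in range(1, 5)])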
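Two details of the agg() approach are easy to miss: the function receives one column (a Series) at a time rather than the whole group, and lambdas created in a comprehension are only called later, so the loop variable has to be bound explicitly. The sketch below illustrates both on a single selected column. The toy df, the column "x", and this Series-based version of f are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: "a" and "b" are the grouping keys, "x" is a value column.
df = pd.DataFrame({
    "a": ["p", "p", "q", "q"],
    "b": ["u", "u", "v", "w"],
    "x": [1, 2, 3, 4],
})

def f(series, n):
    # Hypothetical f for agg(): it receives a single column of one group.
    return (series * n).sum()

# i=i freezes the current loop value inside each lambda. Without it, every
# lambda would look up i when agg() finally calls it and use the last value.
result = df.groupby(["a", "b"])["x"].agg(
    [(str(i), lambda s, i=i: f(s, i)) for i in range(1, 5)]
)
```

Selecting ["x"] before agg() keeps the output columns flat; applied to the whole DataFrame, the same list would be run against every non-key column.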
Pandas groupby.apply accepts arbitrary positional and keyword arguments, which are passed on to the function being applied. In addition, you can create a dictionary mapping each column name to its argument. Finally, you can also reuse a groupby object, which can be defined outside your loop.
argmap = {'2': 2, '3': 3, '4': 4}
grouper = df.groupby(['a', 'b'])
for k, v in argmap.items():
    newdf[k] = grouper.apply(f, v)
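Putting those three points together, a self-contained sketch might look like the following. It builds the whole result in one step rather than adding columns to an existing newdf. The toy df, the column "x", and the body of f are assumptions for illustration; only the use of apply with an extra positional argument, the argument dictionary, and the reused groupby object come from the answer above.

```python
import pandas as pd

# Hypothetical data: "a" and "b" are the grouping keys, "x" is a value column.
df = pd.DataFrame({
    "a": ["p", "p", "q", "q"],
    "b": ["u", "u", "v", "w"],
    "x": [1, 2, 3, 4],
})

def f(group, n):
    # Hypothetical stand-in for the question's f: scale "x" by n and sum it.
    return (group["x"] * n).sum()

# Output column name -> extra argument forwarded to f.
argmap = {"1": 1, "2": 2, "3": 3, "4": 4}

# Build the groupby object once; [["x"]] keeps the grouping keys out of
# the frames handed to f.
grouper = df.groupby(["a", "b"])[["x"]]

# apply(f, n) calls f(group, n) for every group, so no lambda is needed.
newdf = pd.DataFrame({name: grouper.apply(f, n) for name, n in argmap.items()})
```

Passing the argument through apply avoids the lambda entirely, which also sidesteps any question of when the loop variable is evaluated.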