Apply Function or Lambda to Pandas GROUPBY

I would like to apply a specific function (in this case a logit model) to a dataframe that can be grouped (by the variable "model"). I know the task can be performed with a loop, but I believe this to be inefficient at best. Example code below:

import pandas as pd
import numpy as np
import statsmodels.api as sm
df1=pd.DataFrame(np.random.randint(0,100,size=(100,10)),columns=list('abcdefghij'))
df2=pd.DataFrame(np.random.randint(0,100,size=(100,10)),columns=list('abcdefghij'))
df1['model']=1
df1['target']=np.random.randint(2,size=100)
df2['model']=2
df2['target']=np.random.randint(2,size=100)
data=pd.concat([df1,df2])
### Clunky, but works...  
for i in range(1,2+1):
    lm=sm.Logit(data[data['model']==i]['target'],
                sm.add_constant(data[data['model']==i].drop(['target'],axis=1))).fit(disp=0)
    print(lm.summary2())
### Can this work?  
def elegant(self):
    lm=sm.Logit(data['target'],
                sm.add_constant(data.drop(['target'],axis=1))).fit(disp=0)
better=data.groupby(['model']).apply(elegant)

If the above groupby can work, is it more efficient than looping?

This could work:

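def elegant(df):
    # df is the sub-dataframe for one value of 'model'
    lm = sm.Logit(df['target'],
                  sm.add_constant(df.drop(['target'], axis=1))).fit(disp=0)
    # return the fitted result so that .apply can collect it
    return lm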

better = data.groupby('model').apply(elegant)

With .apply, you pass each group's dataframe to the function elegant, so elegant has to take a dataframe as its first argument. Your function also needs to return the result of the calculation, lm.
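To make the mechanics concrete, here is a minimal sketch that swaps the logit fit for a cheap summary so it runs without statsmodels. The function name describe_group and the small dataframe are made up for illustration. The columns are selected explicitly before .apply, which is an assumption made to keep the behaviour the same across pandas versions that differ in whether the grouping column is passed to the function.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's data: two models, five rows each.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "model": np.repeat([1, 2], 5),
    "a": rng.integers(0, 100, size=10),
    "target": rng.integers(0, 2, size=10),
})

def describe_group(df):
    # df is the sub-dataframe for a single value of "model".
    # Whatever is returned here becomes that group's entry in the result.
    return pd.Series({"n_rows": len(df), "target_mean": df["target"].mean()})

# Selecting the columns first means the function sees only "a" and "target".
summary = data.groupby("model")[["a", "target"]].apply(describe_group)
print(summary)
```

The result has one row per value of "model", which is the same shape of bookkeeping the loop in the question does by hand.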

For more complex functions, the following structure can be used:

def some_func(df, kw_param=1):
    # some calculations on df using kw_param
    return df

better = data.groupby('model').apply(lambda group: some_func(group, kw_param=99))
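As a side note, .apply also forwards extra keyword arguments to the function, so the lambda wrapper is optional. The sketch below compares the two spellings on a toy dataframe. The function scale_sum and the data are hypothetical and only serve to show that both calls give the same result.

```python
import pandas as pd

data = pd.DataFrame({
    "model": [1, 1, 2, 2],
    "a": [1.0, 2.0, 3.0, 4.0],
})

def scale_sum(df, kw_param=1):
    # Sum column "a" within the group and scale it by kw_param.
    return df["a"].sum() * kw_param

grouped = data.groupby("model")[["a"]]

# Spelling 1: wrap the call in a lambda, as in the answer above.
via_lambda = grouped.apply(lambda group: scale_sum(group, kw_param=99))

# Spelling 2: let .apply forward the keyword argument itself.
via_kwargs = grouped.apply(scale_sum, kw_param=99)

print(via_lambda.equals(via_kwargs))
```

The lambda form is still useful when the group is not the first positional argument of the function being called.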
