
How to use functions with several parameters in a groupby

I have the following dataset, for which I want to calculate several aggregation metrics:

[image]

For some I'm using the standard functions, but for others I rely on the tsfresh library, from which I'm importing the functions:

sample.groupby('id').agg(['std', benford_correlation, absolute_maximum])

It works well for functions that have only one parameter, as is the case for:

from tsfresh.feature_extraction.feature_calculators import benford_correlation #(x)
from tsfresh.feature_extraction.feature_calculators import absolute_maximum #(x)

But for others, like:

from tsfresh.feature_extraction.feature_calculators import autocorrelation #(x, lag)

[image]

I get an error, since it has two parameters, x and lag, but I'm only passing x implicitly in the groupby.

How can I specify the other required parameters?

See the pandas.DataFrameGroupBy.aggregate docs. Additional keyword arguments are passed to the function, so you can do this:

sample.groupby('id').agg(
    ['std', benford_correlation, absolute_maximum],
    additional_arg=value,
)
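
As a minimal sketch of the keyword-forwarding case: when a single callable is passed to agg on one column, extra keyword arguments are forwarded to that callable. The lagged_mean_abs_diff function, the value column, and the data below are made up for illustration; the function is only a stand-in for a two-parameter tsfresh calculator such as autocorrelation. Whether keyword arguments are also forwarded when a list of functions is given may depend on the pandas version, so it is worth checking on your setup.

```python
import numpy as np
import pandas as pd


def lagged_mean_abs_diff(x, lag):
    """Hypothetical two-parameter metric: mean of |x[t + lag] - x[t]|."""
    v = np.asarray(x, dtype=float)
    return float(np.abs(v[lag:] - v[:-lag]).mean())


# Made-up data: two ids with four observations each.
sample = pd.DataFrame(
    {
        "id": ["a"] * 4 + ["b"] * 4,
        "value": [1.0, 2.0, 4.0, 7.0, 10.0, 10.0, 12.0, 12.0],
    }
)

# lag=2 is not consumed by agg itself; it is forwarded to the callable.
result = sample.groupby("id")["value"].agg(lagged_mean_abs_diff, lag=2)
print(result)
```

The result is a Series indexed by id, with one value per group.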

But if you need to pass different arguments to each function, you could use a lambda function:

sample.groupby('id').agg(
    [
        'std',
        # autocorrelation is the function that takes lag; benford_correlation only takes x
        lambda s: autocorrelation(s, lag=1),
        absolute_maximum,
    ],
)
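
A lambda in a list shows up under the column name <lambda>, which gets hard to read once there are several. One possible alternative, sketched here under assumptions, is named aggregation on a single column, where each keyword becomes the output column name. The example uses pandas' own Series.autocorr as a stand-in for the tsfresh autocorrelation (the two may not compute exactly the same quantity); the value column and the data are made up.

```python
from functools import partial

import pandas as pd

# Made-up data: two ids with six observations each.
sample = pd.DataFrame(
    {
        "id": ["a"] * 6 + ["b"] * 6,
        "value": [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
    }
)

result = sample.groupby("id")["value"].agg(
    std="std",
    # A lambda fixes the extra parameter for this one metric.
    autocorr_lag1=lambda s: s.autocorr(lag=1),
    # functools.partial does the same without writing a lambda.
    autocorr_lag2=partial(pd.Series.autocorr, lag=2),
)
print(result)
```

The result is a DataFrame indexed by id with the columns std, autocorr_lag1 and autocorr_lag2.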
