
Pandas groupby count values above threshold

I have a groupby question that I can't solve. It is probably simple, but I can't get it to work cleanly. I am trying to compute some statistics on a variable using pandas groupby chained with the very handy agg function. I would like to add to the list below a count of the values above a given threshold.

df = df.groupby(['scenario','Name','year','month'])["Value"].agg([np.min,np.max,np.mean,np.std]) 

Usually, I compute the number of values above a given threshold as shown below, but I can't find a way to add this to the aggregation function. Do you know how I could do that?

df = df[df > 0].groupby(['scenario','Name','year','month']).count()

Your answer works. Alternatively, you could do it in one line with a lambda x: expression instead of defining a separate function.

df = df.groupby(["scenario", "Name", "year", "month"])["Value"].agg([np.min, np.max, np.mean, np.std, lambda x: ((x > 0)*1).sum()])

The logic here: (x > 0) returns a True/False boolean for each value; *1 turns each boolean into an integer (1 = True, 0 = False); .sum() then adds up the 1s and 0s within the group. Since True = 1, the sum is the number of values greater than 0.
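As a quick check of that logic, here is a minimal sketch on a single made-up group. The values in `x` are hypothetical and only serve to illustrate the steps. It also shows that summing the boolean mask directly gives the same count, so the `*1` step is optional.

```python
import pandas as pd

# Hypothetical values for one group (not from the original data)
x = pd.Series([-2.0, 0.0, 3.5, 7.0])

mask = x > 0                    # boolean Series: False, False, True, True
as_int = mask * 1               # integer Series: 0, 0, 1, 1
count_explicit = as_int.sum()   # sum of the 1s and 0s
count_direct = mask.sum()       # True is treated as 1 when summed

print(count_explicit, count_direct)  # prints "2 2"
```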

In a quick timing test your solution was faster, but I thought I would offer an alternative anyway.

I found a solution by creating a function and passing it to the agg function.

import numpy as np

def counta(x):
    # Count the values in the group that are above the threshold (10 here)
    return np.count_nonzero(x > 10)

df = df.groupby(['scenario','Name','year','month'])["Value"].agg([np.min,np.max,np.mean,np.std,counta])
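For anyone who wants to try this end to end, below is a minimal self-contained sketch of the same idea. The sample data and the helper name `count_above` are assumptions made for illustration, not part of the original post. The helper builds a counting function for any threshold, so the value is not hard-coded, and the built-in statistics are passed as string aliases, which pandas also accepts in place of the NumPy functions.

```python
import numpy as np
import pandas as pd

def count_above(threshold):
    """Build an aggregation function counting values strictly above threshold.

    Hypothetical helper, introduced here only for illustration.
    """
    def _count(x):
        return int(np.count_nonzero(x > threshold))
    # pandas uses the function name as the column label in the result
    _count.__name__ = f"count_above_{threshold}"
    return _count

# Made-up sample data with the column names used in the question
df = pd.DataFrame({
    "scenario": ["a", "a", "a", "b", "b"],
    "Name": ["n1"] * 5,
    "year": [2020] * 5,
    "month": [1] * 5,
    "Value": [5.0, 12.0, 20.0, 11.0, 3.0],
})

stats = df.groupby(["scenario", "Name", "year", "month"])["Value"].agg(
    ["min", "max", "mean", "std", count_above(10)]
)
print(stats)
```

With this sample data, group "a" has two values above 10 and group "b" has one, which appear in the `count_above_10` column.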


 