Pandas filter by condition in groupby

I have a matrix/dataframe with times and values:

     # time             # Value
M = [[2018-08-08 12:00:00, 5],
     [2018-08-08 12:00:00, 7],
     [2018-08-08 13:00:00, 2],]

I want to group by hour, calculate the mean value of each group, and then reduce each group so that it keeps only the values <= that group's mean.

Current version:

grouped = M.groupby(pd.Grouper(key='time', freq='1h'))
means = grouped['value'].mean().values # np.array([6, 2])

Here I'm stuck. I get the mean value for each group, but I don't know how to reduce "grouped" so that each group keeps only the rows satisfying grouped['value'] <= mean for that group.

Appreciate any suggestions.


Expected output:

N = [[2018-08-08 12:00:00, 5], # as 5 <= 6 where 6 is the mean of the first group
     [2018-08-08 13:00:00, 2]] # as 2 is <= 2 where 2 is the mean of the second group

Use GroupBy.transform, which returns a Series the same size as the original DataFrame, filled with the aggregated value of each row's group, so boolean indexing works directly:

import pandas as pd

M = [['2018-08-08 12:00:00', 5],
     ['2018-08-08 12:00:00', 7],
     ['2018-08-08 13:00:00', 2]]

M = pd.DataFrame(M, columns=['time','value'])
M['time'] = pd.to_datetime(M['time'])
print (M)
                 time  value
0 2018-08-08 12:00:00      5
1 2018-08-08 12:00:00      7
2 2018-08-08 13:00:00      2

s = M.groupby(pd.Grouper(key='time', freq='1h'))['value'].transform('mean')
print (s)
0    6
1    6
2    2
Name: value, dtype: int64

# Compare each row's group mean against a fixed scalar threshold
mean = 5
df = M[s <= mean]
print (df)
                 time  value
2 2018-08-08 13:00:00      2
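
Note that the scalar comparison above keeps or drops whole groups, since every row in a group carries the same group mean. As a rough sketch of the same idea, GroupBy.filter can express it as a per-group predicate. The names hour and kept are made up for illustration, and the hourly key is built with dt.floor('h') instead of pd.Grouper, which is an assumption made to keep the example simple:

```python
import pandas as pd

M = pd.DataFrame({
    'time': pd.to_datetime(['2018-08-08 12:00:00',
                            '2018-08-08 12:00:00',
                            '2018-08-08 13:00:00']),
    'value': [5, 7, 2],
})

# Hourly grouping key (assumption: flooring timestamps to the hour
# stands in for pd.Grouper(key='time', freq='1h'))
hour = M['time'].dt.floor('h')

# Keep only the groups whose mean is <= 5; all rows of a kept group survive
kept = M.groupby(hour).filter(lambda g: g['value'].mean() <= 5)
print(kept)
```

Here the 12:00 group (mean 6) is dropped entirely and the 13:00 group (mean 2) is kept, which should match the transform-based result above.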

EDIT: 编辑:

You can also compare the column values against the group means, which keeps only the rows whose value is <= the mean of their own group:

df1 = M[M['value'] <= s]
print (df1)
                 time  value
0 2018-08-08 12:00:00      5
2 2018-08-08 13:00:00      2
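
For convenience, the steps can be put together into one self-contained script. This is only a minimal sketch built from the data in the question; the names group_mean and N are chosen here for illustration:

```python
import pandas as pd

M = pd.DataFrame({
    'time': pd.to_datetime(['2018-08-08 12:00:00',
                            '2018-08-08 12:00:00',
                            '2018-08-08 13:00:00']),
    'value': [5, 7, 2],
})

# Mean of each hourly group, broadcast back to every row (same index as M)
group_mean = M.groupby(pd.Grouper(key='time', freq='1h'))['value'].transform('mean')

# Keep rows whose value does not exceed the mean of their own group
N = M[M['value'] <= group_mean]
print(N)
```

Because transform returns a Series aligned with the index of M, the comparison is done row by row and no explicit loop over the groups is needed.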
