
Rolling function in pandas with condition

I have a dataframe with the following structure:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"] * 2,
        "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "x": [1, 2, 2, 3, 2, 3, 4, 2],
        "condition": [1, 0, 1, 0] * 2
    }
)
df

I want to calculate the rolling average of the column x:

  • Per group
  • Using only past data (not using the current row)
  • Using only rows where condition = 1 in the average

The outcome should be the following:

[image: expected output table]

How can I do that in pandas? Thanks!

I think we should filter the dataframe on the following conditions and then calculate the mean of x over the rows that remain:

  • group == group of the current row
  • date < date of the current row
  • condition == 1
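To make the three conditions concrete, here is a minimal sketch that applies them to a single row (the last row of group A). The names `current`, `mask` and `past` are only illustrative, and converting date with pd.to_datetime is an extra step assumed here so the comparison is a real date comparison rather than a string comparison.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"] * 2,
        "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "x": [1, 2, 2, 3, 2, 3, 4, 2],
        "condition": [1, 0, 1, 0] * 2,
    }
)
# Assumption: parse the dates so "<" compares dates, not strings.
df["date"] = pd.to_datetime(df["date"])

# Pick one row to illustrate: group A, 2020-01-04.
current = df.iloc[3]

# The three conditions, combined into one boolean mask.
mask = (
    (df["group"] == current["group"])   # same group as the current row
    & (df["date"] < current["date"])    # strictly earlier date
    & (df["condition"] == 1)            # only rows that meet the condition
)

past = df.loc[mask, "x"]
print(past.tolist())  # [1, 2]
print(past.mean())    # 1.5
```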

df.apply with axis=1 is used to apply this filter to every row of the dataframe:

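# For each row: mean of x over earlier rows of the same group with condition == 1
df['rolling_avg_x'] = df.apply(lambda row: df.loc[(df['group'] == row['group']) & (df['date'] < row['date']) & (df['condition'] == 1), 'x'].mean(), axis=1)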

This gives the desired output. Rows that have no earlier condition = 1 row in their group get NaN.
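The row-by-row filter rescans the whole dataframe for every row, which can get slow on large data. As a possible alternative (not part of the original answer), the same numbers can be sketched with groupby, shift and expanding. The helper name `past_conditional_mean` is hypothetical, and the sketch assumes one row per date within each group, so that "previous row" and "earlier date" mean the same thing.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"] * 2,
        "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "x": [1, 2, 2, 3, 2, 3, 4, 2],
        "condition": [1, 0, 1, 0] * 2,
    }
)


def past_conditional_mean(frame: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: per group, mean of earlier x values with condition == 1."""
    # Sort so that "earlier row" means "earlier date" inside each group.
    ordered = frame.sort_values(["group", "date"])
    # Replace x with NaN where the condition is not met; mean() skips NaN.
    masked = ordered["x"].where(ordered["condition"] == 1)
    # shift() excludes the current row, expanding().mean() averages
    # all remaining earlier values in the group.
    return masked.groupby(ordered["group"]).transform(
        lambda s: s.shift().expanding().mean()
    )


result = df.copy()
# Assignment aligns on the index, so the original row order is kept.
result["rolling_avg_x"] = past_conditional_mean(df)
print(result)
```

Like the df.apply version, this averages over all past rows that meet the condition, so it is an expanding mean rather than a fixed-size rolling window.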
