
Pandas rolling window selection based on a condition and calculate

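How can I calculate a rolling window mean based on a condition? By rolling window I mean that, for each index, I take every row whose coordinate differs from that row's coordinate by less than 400.

I need to add the result as a new column.

For example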

at Index 
cg13869341 = mean(cg13869341, cg14008030)
cg14008030 = mean(cg13869341, cg14008030) 
cg14008031 = mean(cg13869341)  
...
cg14008033 = mean(cg14008031,cg40826798, cg14008034, cg40826792)
....        
cg40826792 = mean(cg60826792, cg47454306, cg14008034, cg14008033, cg40826792)

Sample dataset

Index       coordinate   rolling_mean
cg13869341  100         
cg14008030  200         
cg14008031  800         
cg40826798  900         
cg14008033  1000        
cg14008034  1050            
cg40826792  1250            
cg47454306  1500

With the dataframe you provided:

import pandas as pd

df = pd.DataFrame(
    {
        "index": [
            "cg13869341",
            "cg14008030",
            "cg14008031",
            "cg40826798",
            "cg14008033",
            "cg14008034",
            "cg40826792",
            "cg47454306",
        ],
        "coordinate": [100, 200, 800, 900, 1000, 1050, 1250, 1500],
    }
)

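Here is one way to do it using Pandas apply. Note that the bounds in this code are inclusive, so a row is selected when its coordinate lies within ±400 of the current row's coordinate: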

df["rolling_mean"] = df.apply(
    lambda x: df.loc[
        (df["coordinate"] >= x["coordinate"] - 400)
        & (df["coordinate"] <= x["coordinate"] + 400),
        "coordinate",
    ].mean(),
    axis=1,
)

Then:

print(df)
# Output
        index  coordinate  rolling_mean
0  cg13869341         100         150.0
1  cg14008030         200         150.0
2  cg14008031         800         937.5
3  cg40826798         900        1000.0
4  cg14008033        1000        1000.0
5  cg14008034        1050        1000.0
6  cg40826792        1250        1140.0
7  cg47454306        1500        1375.0

