Pandas rolling window selection based on a condition and calculate
How can I calculate a rolling window mean based on a condition? For each index, the window should contain every row whose coordinate differs from that row's coordinate by less than 400.
I need to add the result as a new column.
For example, at each index:
cg13869341 = mean(cg13869341, cg14008030)
cg14008030 = mean(cg13869341, cg14008030)
cg14008031 = mean(cg13869341)
...
cg14008033 = mean(cg14008031,cg40826798, cg14008034, cg40826792)
....
cg40826792 = mean(cg60826792, cg47454306, cg14008034, cg14008033, cg40826792)
Example dataset
Index coordinate rolling_mean
cg13869341 100
cg14008030 200
cg14008031 800
cg40826798 900
cg14008033 1000
cg14008034 1050
cg40826792 1250
cg47454306 1500
With the dataframe you provided:
import pandas as pd

df = pd.DataFrame(
    {
        "index": [
            "cg13869341",
            "cg14008030",
            "cg14008031",
            "cg40826798",
            "cg14008033",
            "cg14008034",
            "cg40826792",
            "cg47454306",
        ],
        "coordinate": [100, 200, 800, 900, 1000, 1050, 1250, 1500],
    }
)
Here is one way to do it using Pandas apply:
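# For each row, average the coordinates of all rows within 400 of it.
# Both bounds are inclusive; use > and < instead for a strict "< 400" window.
df["rolling_mean"] = df.apply(
    lambda x: df.loc[
        (df["coordinate"] >= x["coordinate"] - 400)
        & (df["coordinate"] <= x["coordinate"] + 400),
        "coordinate",
    ].mean(),
    axis=1,
)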
Then:
print(df)
# Output
index coordinate rolling_mean
0 cg13869341 100 150.0
1 cg14008030 200 150.0
2 cg14008031 800 937.5
3 cg40826798 900 1000.0
4 cg14008033 1000 1000.0
5 cg14008034 1050 1000.0
6 cg40826792 1250 1140.0
7 cg47454306 1500 1375.0
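As a possible alternative for larger frames, the same window can be sketched with NumPy broadcasting instead of a row-wise apply. This is only a sketch under stated assumptions: the column name rolling_mean_vec and the variables coords and mask are introduced here for illustration and are not part of the original answer, the window is inclusive (difference <= 400) to match the apply version above, and the pairwise mask needs memory proportional to the square of the row count.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "index": [
            "cg13869341",
            "cg14008030",
            "cg14008031",
            "cg40826798",
            "cg14008033",
            "cg14008034",
            "cg40826792",
            "cg47454306",
        ],
        "coordinate": [100, 200, 800, 900, 1000, 1050, 1250, 1500],
    }
)

# Hypothetical names (coords, mask, rolling_mean_vec) used only for this sketch.
coords = df["coordinate"].to_numpy()

# mask[i, j] is True when row j lies within 400 of row i (inclusive bounds).
mask = np.abs(coords[:, None] - coords[None, :]) <= 400

# Per-row mean: sum of the coordinates selected by the mask / number selected.
df["rolling_mean_vec"] = (mask * coords).sum(axis=1) / mask.sum(axis=1)

print(df)
```

On the example data this should give the same values as the apply version. Switching `<=` to `<` makes the window strictly "less than 400", which does not change the result here because no two coordinates differ by exactly 400.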