Create rolling windows in pandas based on window size specified in another column
I have a pandas DataFrame that I'd like to sum on a rolling basis, where the window size for each row is specified by another column.
For example:

| values_to_sum | window_size | rolling_sum |
|---|---|---|
| 1 | 6 | 17 |
| 2 | 5 | 16 |
| 1 | 2 | 4 |
| 3 | 5 | 19 |
| 4 | 5 | NaN |
| 6 | 4 | NaN |
| 2 | 3 | NaN |
| 4 | 3 | NaN |

Trying to pass the column `window_size` to the `rolling` function results in the error `ValueError: window must be an integer`.
How can I use the column `window_size` on a row-by-row basis for the `rolling` function?
With a list comprehension:

    import numpy as np

    # NaN when the forward window would run past the last row; otherwise sum the slice
    df["rolling_sum"] = [np.nan
                         if j + ws > len(df.index)
                         else df.values_to_sum.iloc[j: j+ws].sum()
                         for j, ws in enumerate(df.window_size)]

Put `np.nan` if the current index (`j`) plus the window size (`ws`) exceeds the DataFrame's length (`len(df.index)`); otherwise, take the window with `iloc` and `sum` it. Note that the window is forward-looking: it starts at the current row and covers the next `ws` rows.
This gives:

       values_to_sum  window_size  rolling_sum
    0              1            6         17.0
    1              2            5         16.0
    2              1            2          4.0
    3              3            5         19.0
    4              4            5          NaN
    5              6            4          NaN
    6              2            3          NaN
    7              4            3          NaN

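
To reproduce this result end to end, a minimal sketch follows. The DataFrame construction is an assumption based on the example table in the question; the comprehension is the one from the answer, unchanged in logic.

```python
import numpy as np
import pandas as pd

# Sample data copied from the example table in the question
df = pd.DataFrame({
    "values_to_sum": [1, 2, 1, 3, 4, 6, 2, 4],
    "window_size":   [6, 5, 2, 5, 5, 4, 3, 3],
})

# Forward-looking window: row j sums rows j .. j+ws-1,
# or gets NaN when that slice would run past the last row
df["rolling_sum"] = [
    np.nan
    if j + ws > len(df.index)
    else df.values_to_sum.iloc[j: j + ws].sum()
    for j, ws in enumerate(df.window_size)
]

print(df)
```

Row 3 is the last row that still gets a value: its window covers rows 3 through 7, which ends exactly at the last row of the frame.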
Note: you can pre-define `df_length = len(df.index)` and use it in the comprehension to avoid recomputing the length on every iteration.
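
If the per-row Python loop ever becomes a bottleneck, the same forward-looking sums can be sketched with a cumulative sum instead. This is a different technique from the list comprehension above, not part of the original answer. It assumes non-negative integer window sizes and no missing values in `values_to_sum`; the column name `rolling_sum_vectorized` and the sample data are illustrative. The length is computed once, in the spirit of the note.

```python
import numpy as np
import pandas as pd

# Sample data copied from the example table in the question
df = pd.DataFrame({
    "values_to_sum": [1, 2, 1, 3, 4, 6, 2, 4],
    "window_size":   [6, 5, 2, 5, 5, 4, 3, 3],
})

values = df["values_to_sum"].to_numpy()
windows = df["window_size"].to_numpy()
n = len(values)  # computed once rather than per row

# csum[k] holds the sum of the first k values, so csum[0] == 0
csum = np.concatenate(([0], np.cumsum(values)))

start = np.arange(n)
stop = start + windows   # exclusive end of each forward window
fits = stop <= n         # True where the window stays inside the frame

# Clip the end index so out-of-range windows still index safely;
# those rows are replaced by NaN in the next step
window_sums = csum[np.minimum(stop, n)] - csum[start]
df["rolling_sum_vectorized"] = np.where(fits, window_sums, np.nan)

print(df)
```

The sum over rows `start` to `stop - 1` is the difference of two prefix sums, so each row costs two array lookups regardless of its window size. With floating-point data, the prefix-sum differences may differ from the direct sums by rounding error.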