I have a pandas dataframe that I'd like to sum on a rolling basis where the window is specified by another column.
For example,
values_to_sum | window_size | rolling_sum |
---|---|---|
1 | 6 | 17 |
2 | 5 | 16 |
1 | 2 | 4 |
3 | 5 | 19 |
4 | 5 | NaN |
6 | 4 | NaN |
2 | 3 | NaN |
4 | 3 | NaN |
Passing the column `window_size` to the rolling function results in the error `ValueError: window must be an integer`.
How can I apply the column `window_size` on a row-by-row basis in the rolling function?
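For reference, the two input columns from the table above could be rebuilt as follows. This is only a minimal sketch for reproducing the example; the variable name `df` is an assumption, and `rolling_sum` is left out because it is the desired output.

```python
import pandas as pd

# Input columns copied from the example table; `df` is an assumed name.
df = pd.DataFrame(
    {
        "values_to_sum": [1, 2, 1, 3, 4, 6, 2, 4],
        "window_size": [6, 5, 2, 5, 5, 4, 3, 3],
    }
)
print(df)
```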
With a list comprehension:

    import numpy as np

    df["rolling_sum"] = [np.nan
                         if j + ws > len(df.index)
                         else df.values_to_sum.iloc[j: j+ws].sum()
                         for j, ws in enumerate(df.window_size)]

This puts `np.nan` if the current index (`j`) plus the window size (`ws`) exceeds the dataframe's length (`len(df.index)`); otherwise it takes the window with `iloc` and sums it.

The result is:
values_to_sum window_size rolling_sum
0 1 6 17.0
1 2 5 16.0
2 1 2 4.0
3 3 5 19.0
4 4 5 NaN
5 6 4 NaN
6 2 3 NaN
7 4 3 NaN
Note: you can pre-define `df_length = len(df.index)` and use it in the comprehension, so the length is not looked up again on every iteration.
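A minimal sketch of that variant, with the length computed once, might look like the following. The helper name `forward_window_sum` and the frame construction are assumptions added for illustration; the logic is the same forward-looking window as in the comprehension above.

```python
import numpy as np
import pandas as pd


def forward_window_sum(values: pd.Series, windows: pd.Series) -> list:
    """Sum `values` over a forward window whose size varies per row.

    Hypothetical helper: rows whose window would run past the end get NaN.
    """
    n = len(values)  # looked up once instead of on every iteration
    return [
        np.nan if j + ws > n else values.iloc[j : j + ws].sum()
        for j, ws in enumerate(windows)
    ]


# Example data taken from the question's table.
df = pd.DataFrame(
    {
        "values_to_sum": [1, 2, 1, 3, 4, 6, 2, 4],
        "window_size": [6, 5, 2, 5, 5, 4, 3, 3],
    }
)
df["rolling_sum"] = forward_window_sum(df["values_to_sum"], df["window_size"])
print(df)
```

This is still a Python-level loop over the rows, so it is likely to be slow on very large frames.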