Is there a faster way to do this loop?

Question

I want to create a new column using the following loop. The table just has the columns 'open' and 'start'. I want to create a new column 'startopen', where if 'start' equals 1, then 'startopen' is equal to 'open'. Otherwise, 'startopen' is equal to whatever 'startopen' was in the row above in this newly created column. Currently I'm able to achieve this using the following:

for i in range(df.shape[0]):
    if df['start'].iloc[i] == 1:
        df.loc[df.index[i],'startopen'] = df.loc[df.index[i],'open']
    else:
        df.loc[df.index[i],'startopen'] = df.loc[df.index[i-1],'startopen']

This works, but is very slow for large datasets. Are there any built-in functions that can do this faster?

Answer 1

I want to create a new column 'startopen', where if 'start' equals 1, then 'startopen' is equal to 'open'.

Otherwise, 'startopen' is equal to whatever 'startopen' was in the row above in this newly created column.

IIUC, the "otherwise" part amounts to a forward fill: set 'startopen' to 'open' where 'start' equals 1, leave the other rows as NaN, then fill each NaN with the most recent value above it.

df['startopen'] = pd.Series(np.where(df['start'].eq(1), df['open'], np.nan), index=df.index).ffill()
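For anyone who wants to try this on a small frame first, here is a minimal self-contained sketch. The sample values are made up for illustration; only the column names come from the question. It uses `Series.where` instead of `np.where`, which should give the same result while keeping the index without wrapping in `pd.Series`.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 13.0, 14.0],
    "start": [1, 0, 0, 1, 0],
})

# Keep 'open' where start == 1 and NaN elsewhere,
# then carry the last kept value forward into the NaN rows.
df["startopen"] = df["open"].where(df["start"].eq(1)).ffill()

print(df)
```

Note that any rows before the first 'start' == 1 stay NaN, since there is nothing above them to fill from. Also, if 'open' itself is NaN on a row where 'start' is 1, the forward fill will carry the previous value through that row, which differs from what the loop would do.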
