简体   繁体   中英

Checking for subset in a column?

I'm trying to flag some price data as "stale" if the quoted price of the security hasn't changed over lets say 3 trading days. I'm currently trying it with:

firm["dev"] = np.std(firm["Price"],firm["Price"].shift(1),firm["Price"].shift(2))
firm["flag"] == np.where(firm["dev"] = 0, 1, 0)

But I'm getting nowhere with it. This is what my dataframe would look like.

Index Price Flag
1 10 0
2 11 0
3 12 0
4 12 0
5 12 1
6 11 0
7 13 0

Any help is appreciated!

If you are okay with other conditions, you can first check if series.diff equals 0 and take cumsum to check if you have a cumsum of 2 (n-1). Also check if the next row is equal to current, when both these conditions suffice, assign a flag of 1 else 0.

n=3
firm['Flag'] = (firm['Price'].diff().eq(0).cumsum().eq(n-1) & 
                firm['Price'].eq(firm['Price'].shift())).astype(int)

EDIT, to make it a generalized function with consecutive n, use this:

def fun(df,col,n):
    c = df[col].diff().eq(0)
    return (c|c.shift(-1)).cumsum().ge(n) & df[col].eq(df[col].shift())

firm['flag_2'] = fun(firm,'Price',2).astype(int)
firm['flag_3'] = fun(firm,'Price',3).astype(int)

print(firm)

       Price  Flag  flag_2  flag_3
Index                             
1         10     0       0       0
2         11     0       0       0
3         12     0       0       0
4         12     0       1       0
5         12     1       1       1
6         11     0       0       0
7         13     0       0       0

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM