
How can I subset the rows that meet a condition, plus the N rows before each one, faster than my current code?

My data set is a time series: I have 30 data frames, each with more than 10,000 rows. I want to examine the trend before the temperature drops below 40.

So I want to select every row where the temperature is below 40, together with the 24 rows before it.

I have tried several approaches, and the only one that works is the code below. But it is very slow (more than 10 minutes for one data frame). Is there a faster way to do this subsetting in Python?

df=temperature_df.copy()
drop_temperature_df=pd.DataFrame()

# get the index labels of the rows where the temperature is below 40
drop_temperature_index=np.array(df[df[temperature]<40].index)

# for each drop, append that row and the 24 rows (24 hours) before it
for index in drop_temperature_index:
    drop_temperature_df=drop_temperature_df.append(df.loc[index-24:index,:])

K['K_{}'.format(string)]=drop_temperature_df.copy() #save the subset data frame

For example, my data has a temperature below 40 at 1/26/2018 0800, so I want to select that row together with the 24 rows before it (1/25/2018 0800 through 1/26/2018 0800).


I think you can use bfill with a limit, find the non-null positions, and use them to slice the data frame:

yourdf=df[df.temperature.where(df.temperature<40).bfill(limit=24).notnull()].copy()
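To see what that one-liner does step by step, here is a minimal sketch on made-up data. The toy values and the names N_BEFORE, THRESHOLD, below and mask are only for illustration and are not from the question; the window is shortened to 3 rows so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical toy data: ten readings with a single drop below 40 at position 7.
df = pd.DataFrame({"temperature": [50, 52, 51, 49, 48, 47, 45, 39, 46, 50]})

N_BEFORE = 3    # rows to keep before each drop (the question uses 24)
THRESHOLD = 40  # a reading below this counts as a drop

# Keep the readings below the threshold and turn everything else into NaN.
below = df["temperature"].where(df["temperature"] < THRESHOLD)

# Back-fill at most N_BEFORE rows, so the rows just before each drop become non-null.
mask = below.bfill(limit=N_BEFORE).notnull()

# Boolean indexing selects every marked row in one vectorised step.
subset = df[mask].copy()
print(subset)
```

One difference from the loop in the question: when two drops are closer together than the window, the loop appends the overlapping rows more than once, while the mask keeps each row only once.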

