
How to save previous row index based on some condition in pandas

For each row that meets a condition, I want to save the index (or date) of the previous row that also met it.

Speed is crucial here, so I am trying to use a vectorized operation, but with no success so far.

For example, I have this dataframe:

import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=10)
data = {'date': dates}
df = pd.DataFrame.from_dict(data)

df['condition'] = [False, False, True, True, False, True, False, False, True, False]
df['desired_result'] = [np.nan, np.nan, np.nan, df.iloc[2]['date'], np.nan, df.iloc[3]['date'], np.nan, np.nan, df.iloc[5]['date'], np.nan]

   date                 condition  desired_result
0  2000-01-01 00:00:00  False      NaT
1  2000-01-02 00:00:00  False      NaT
2  2000-01-03 00:00:00  True       NaT
3  2000-01-04 00:00:00  True       2000-01-03 00:00:00
4  2000-01-05 00:00:00  False      NaT
5  2000-01-06 00:00:00  True       2000-01-04 00:00:00
6  2000-01-07 00:00:00  False      NaT
7  2000-01-08 00:00:00  False      NaT
8  2000-01-09 00:00:00  True       2000-01-06 00:00:00
9  2000-01-10 00:00:00  False      NaT

I don't know how to "save" the previous row where the condition was true. How can I do that?

The following should work:

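import pandas as pd

dates = pd.date_range('1/1/2000', periods=10)
data = {'date': dates}
df = pd.DataFrame.from_dict(data)

df['condition'] = [False, False, True, True, False, True, False, False, True, False]

# Start with an all-NaT column, then split the rows by condition.
df['desired_result'] = pd.NaT
df2 = df[df['condition']].copy()  # explicit copy avoids SettingWithCopyWarning
df3 = df[~df['condition']]
# Within the matching rows, shift(1) gives the date of the previous matching row.
df2['desired_result'] = df2['date'].shift(1)
# Recombine both parts and restore the original row order.
result = pd.concat([df2, df3]).sort_index()
print(result)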
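
The same result can probably be reached without splitting and re-concatenating the frame. The sketch below is an alternative, not part of the answer above: it relies on pandas aligning on the index when a Series is assigned to a column, so rows missing from the shifted subset are left as NaT. The column name prev_index is made up for illustration, to cover the "previous row index" part of the question.

```python
import pandas as pd

dates = pd.date_range('1/1/2000', periods=10)
df = pd.DataFrame({'date': dates})
df['condition'] = [False, False, True, True, False, True, False, False, True, False]

# Keep only the dates of rows where the condition holds, then shift within
# that subset so each row sees the date of the previous matching row.
matched = df.loc[df['condition'], 'date']

# Column assignment aligns on the index, so rows absent from `matched`
# (condition is False) are filled with NaT.
df['desired_result'] = matched.shift(1)

# Same idea for the row label instead of the date.
# `prev_index` is a hypothetical column name; it becomes float because of NaN.
labels = pd.Series(df.index, index=df.index).loc[df['condition']]
df['prev_index'] = labels.shift(1)

print(df)
```

Both assignments are vectorized; whether this is faster than the concat approach on a large frame would need to be measured.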

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 