
How to populate missing rows with the previous or next row in pandas (Python)

I have sample data like this.

date        time option_type   open    high     low   close  volume

6031    9/27/2018   09:17   CE  11500   0.15    0.15    0.15    0.15    1500


6131    9/27/2018   15:19   CE  11500   0.05    0.05    0.05    0.05    1500
6132    9/27/2018   15:22   CE  11500   0.05    0.05    0.05    0.05    75
6133    9/27/2018   15:24   CE  11500   0.05    0.05    0.05    0.05    225
6134    9/27/2018   15:25   CE  11500   0.05    0.05    0.05    0.05    75
6135    9/27/2018   15:26   CE  11500   0.05    0.05    0.05    0.05    600

Some rows are missing, for example 09:15 and 09:16, then 15:20 and 15:21, and so on.

I want to fill each missing row with the previous row's values in the case of 15:20/15:21, and with the next available row's values in the case of 09:15/09:16. That is, the 09:17 values will be used for 09:15/09:16, and the 15:19 values will be used for 15:20/15:21.

Could you please help me with this? Thanks in advance; I appreciate your effort and time.

Step 1: Find the time difference between consecutive rows (assuming the time column holds HH:MM strings):

import pandas as pd
df['deltaT'] = pd.to_datetime(df['time'], format='%H:%M').diff().dt.seconds.div(60, fill_value=0)

This gives you a new column with the number of minutes between each row and the one before it.

Step 2: Replicate rows based on the new deltaT column (the repeat counts must be integers):

df = df.reindex(df.index.repeat(df['deltaT'].astype(int)))

Step 3: Build the logic to increment the time column:

df['time'] = pd.to_timedelta(df['time'] + ':00') + pd.to_timedelta(df['deltaT'], unit='m')

I am still struggling with the last part.

I hope you find this helpful and can build on it.
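
For what it is worth, here is a minimal sketch of how the repeat idea above might be completed. It is not the exact code from the steps: it measures the gap to the next row rather than the previous one, so that each row is copied forward over the minutes that follow it, and it numbers the copies with groupby(level=0).cumcount(). The sample frame and the names t, gap, offset, times and out are made up for illustration, and it assumes a single day of rows already sorted by time.

```python
import pandas as pd

# Hypothetical sample in the shape of the question's data; 15:20, 15:21 and 15:23 are missing.
df = pd.DataFrame({
    "time": ["15:19", "15:22", "15:24"],
    "close": [0.05, 0.05, 0.05],
    "volume": [1500, 75, 225],
})

# Parse the HH:MM strings so they can be subtracted.
t = pd.to_datetime(df["time"], format="%H:%M")

# Minutes until the NEXT row; the last row has no successor, so it is kept once.
gap = (t.shift(-1) - t).dt.total_seconds().div(60).fillna(1).astype(int)

# Repeat each row once per minute it has to cover.
out = df.loc[df.index.repeat(gap.to_numpy())].copy()

# Within each block of copies, add 0, 1, 2, ... minutes to the original time.
offset = out.groupby(level=0).cumcount().to_numpy()
times = pd.DatetimeIndex(t.loc[out.index]) + pd.to_timedelta(offset, unit="m")
out["time"] = times.strftime("%H:%M")
out = out.reset_index(drop=True)

print(out)
```

This only fills gaps between existing rows (the 15:20/15:21 case). The leading 09:15/09:16 gap has no earlier row to copy from, so it would still need a separate backward fill.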

I think you are looking for something like this:

df['time'] = df['time'].ffill()  # carry values forward
df['time'] = df['time'].bfill()  # carry values backward

df
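
Note that a forward or backward fill can only fill cells that already exist as NaN, so the missing minutes have to be added as rows first. A minimal sketch of that, assuming one day of sorted data for a single option and a session that opens at 09:15 (implied by the question, not stated). The sample values and the names start, full and out are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: 09:15, 09:16, 09:19 and 09:20 are missing.
df = pd.DataFrame({
    "date": ["9/27/2018"] * 3,
    "time": ["09:17", "09:18", "09:21"],
    "close": [0.15, 0.10, 0.05],
    "volume": [1500, 300, 75],
})

# Build a real timestamp index from the date and time columns.
df.index = pd.to_datetime(df["date"] + " " + df["time"], format="%m/%d/%Y %H:%M")

# Assumption: the session opens at 09:15.
start = df.index[0].normalize() + pd.Timedelta(hours=9, minutes=15)
full = pd.date_range(start, df.index[-1], freq="min")

# Reindexing inserts the missing minutes as all-NaN rows.
# ffill copies the previous row down; bfill then fills the leading gap from the next row.
out = df.reindex(full).ffill().bfill()

# Rebuild the time column from the new index so filled rows carry their own minute.
out["time"] = out.index.strftime("%H:%M")
out = out.reset_index(drop=True)

print(out)
```

Because reindexing introduces NaN, integer columns such as volume come back as floats; cast them back with astype(int) if that matters.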
