is_avail valu data_source
2015-08-07 False 0.282 source_a
2015-08-23 False 0.296 source_a
2015-09-08 False 0.433 source_a
2015-10-01 True 0.169 source_a
2015-10-10 True 0.160 source_a
2015-11-02 False 0.179 source_a
2016-03-09 False 0.557 source_a
2016-04-26 False 0.770 source_a
2016-05-05 False 0.826 source_a
2016-05-12 False 0.826 source_a
2016-05-28 False 0.747 source_a
2016-06-06 False 0.796 source_a
2016-07-31 False 0.322 source_a
2016-08-25 True 0.136 source_a
2016-09-10 False 0.180 source_a
2016-11-13 False 0.492 source_a
2016-12-15 True 0.124 source_a
2016-12-31 False 0.533 source_a
2017-03-28 False 0.524 source_a
2015-06-27 True 0.038 source_b
2015-07-30 True 0.035 source_b
2015-08-06 False 0.205 source_b
2015-08-09 False 0.241 source_b
2015-08-16 True 0.025 source_b
2015-08-19 True 0.092 source_b
2015-08-26 False 0.264 source_b
2015-08-29 False 0.312 source_b
The above dataframe has an index of datetime objects. I want to add rows for the dates that are currently missing from the dataframe. However, I want to add those rows separately for source_a and source_b. For example, 2015-08-08 is a missing date for both source_a and source_b, so I want to add it to the dataframe for each of them. How can I do that?
You can use resample in a groupby and ffill (forward fill):
df.groupby(
'data_source', group_keys=False
).apply(lambda df: df.resample('D').ffill())
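For anyone who wants to try the idea on a small frame, here is a minimal, self-contained sketch. It is not the code from the answer: it swaps in an explicit date_range plus reindex with method="ffill" in place of resample, and loops over the groups instead of using apply. The sample values and the helper name fill_missing_days are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with the same columns as the question's dataframe.
df = pd.DataFrame(
    {
        "is_avail": [False, True, False, True],
        "valu": [0.282, 0.169, 0.205, 0.025],
        "data_source": ["source_a", "source_a", "source_b", "source_b"],
    },
    index=pd.to_datetime(["2015-08-07", "2015-08-10", "2015-08-06", "2015-08-09"]),
)


def fill_missing_days(group: pd.DataFrame) -> pd.DataFrame:
    """Add one row per missing calendar day, copying the previous row's values."""
    group = group.sort_index()  # reindex with a fill method needs a sorted index
    # Daily range from the first to the last date seen for this source.
    full_range = pd.date_range(group.index.min(), group.index.max(), freq="D")
    # New dates take their values from the most recent earlier row.
    return group.reindex(full_range, method="ffill")


# Process each source on its own, then stack the results back together.
filled = pd.concat(
    [fill_missing_days(group) for _, group in df.groupby("data_source")]
)
print(filled)
```

Each source is filled only between its own first and last date, so source_a and source_b can end up covering different ranges.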
Or you can interpolate valu and ffill the rest:
df.groupby(
'data_source', group_keys=False
).apply(
lambda df: df.resample('D').interpolate('index').ffill()
)
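The interpolation variant can be sketched the same way. This is an illustrative alternative, not the answer's exact code: it interpolates only the numeric valu column with method="time" (which weights by the gap between dates) and forward fills the other columns through reindex. The sample values and the helper name interpolate_days are assumptions.

```python
import pandas as pd

# Hypothetical sample; values chosen so the interpolated results are easy to check.
df = pd.DataFrame(
    {
        "is_avail": [False, True, False, True],
        "valu": [0.1, 0.5, 0.2, 0.4],
        "data_source": ["source_a", "source_a", "source_b", "source_b"],
    },
    index=pd.to_datetime(["2015-08-07", "2015-08-11", "2015-08-06", "2015-08-08"]),
)


def interpolate_days(group: pd.DataFrame) -> pd.DataFrame:
    """Add missing days; interpolate valu, forward fill every other column."""
    group = group.sort_index()
    full_range = pd.date_range(group.index.min(), group.index.max(), freq="D")
    # Forward fill all columns first (this also fills valu, overwritten below).
    out = group.reindex(full_range, method="ffill")
    # Replace valu with a linear interpolation between the observed points.
    out["valu"] = group["valu"].reindex(full_range).interpolate(method="time")
    return out


result = pd.concat(
    [interpolate_days(group) for _, group in df.groupby("data_source")]
)
print(result)
```

Interpolating the numeric column on its own avoids asking pandas to interpolate the boolean and string columns, which newer pandas versions may warn about or reject.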