I'm trying to get interpolated values for the metric shown below using Pandas time series.
test.csv
year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016
Code
import pandas as pd
df = pd.read_csv('test.csv', parse_dates={'Timestamp': ['year']},
index_col='Timestamp')
As far as I understand, this gives me a time series with January 1 of each year as the index. Now I need to fill in values for the missing years (2021, 2022, 2023, 2024, 2026, etc.).
Is there a way to do this with Pandas?
If you are using a reasonably recent version of Pandas, the DataFrame object should have an interpolate method that can be used to fill in the gaps.
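As a minimal sketch of what interpolate does (the Series values here are made up for illustration and are not the data from the question), it replaces NaN entries that sit between known values:

```python
import numpy as np
import pandas as pd

# Illustrative values only: two known points with two gaps between them
s = pd.Series([1.0, np.nan, np.nan, 4.0])

# The default method is 'linear', which treats the rows as equally spaced
filled = s.interpolate()
print(filled.tolist())  # [1.0, 2.0, 3.0, 4.0]
```

Note that only rows that already exist and hold NaN get filled; interpolate does not create new rows.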
It turns out that interpolation only fills in values where there are none (NaN); it does not add rows for the missing years. In my case above, what I had to do was re-index first so that the index had one row every 12 months.
# reindex with an interval of 12 months (M: month, S: start of the month);
# the range starts at 2020 to match the first row of test.csv
df_reindexed = df.reindex(pd.date_range(start='20200101', end='20350101', freq='12MS'))
# method='linear' works because the rows are now equally spaced (one per year)
df_interpolated = df_reindexed.interpolate(method='linear')
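For anyone who wants to try this without creating test.csv, here is a self-contained sketch of the same reindex-then-interpolate approach. It reads the data from an in-memory string and builds the index with pd.to_datetime instead of the dict form of parse_dates, which I believe is deprecated in recent pandas releases; the names csv_text and full_index are my own and not from the question.

```python
import io
import pandas as pd

# Same data as test.csv in the question, held in memory for the example
csv_text = """year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016
"""

df = pd.read_csv(io.StringIO(csv_text))

# Turn the integer year into a January 1 timestamp and use it as the index
df['Timestamp'] = pd.to_datetime(df['year'].astype(str), format='%Y')
df = df.drop(columns='year').set_index('Timestamp')

# One row per year between the first and last known data points
full_index = pd.date_range(start=df.index.min(), end=df.index.max(), freq='12MS')

# Reindexing inserts NaN rows for the missing years; interpolate then fills them
df_interpolated = df.reindex(full_index).interpolate(method='linear')
print(df_interpolated)
```

With method='linear' the rows are treated as equally spaced, so each year between 2020 and 2025 moves by the same step. If the exact number of days between timestamps matters (leap years), method='time' is an option that should give slightly different values.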