Interpolate values between sample years with Pandas

I'm trying to get interpolated values for the metric shown below using Pandas time series.

test.csv

year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016

Code

import pandas as pd
df = pd.read_csv('test.csv', parse_dates={'Timestamp': ['year']},
                    index_col='Timestamp')

As far as I understand, this gives me a time series with January 1 of each year as the index. Now I need to fill in values for the missing years (2021, 2022, 2023, 2024, 2026, etc.).

Is there a way to do this with Pandas?

If you are using a reasonably recent version of Pandas, the DataFrame object should have an interpolate method that can be used to fill in the gaps.
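
As a minimal sketch of what that method does, assume a small hand-built frame that holds the 2020 and 2025 values from the question, with the years in between marked as NaN. The frame and the variable names here are illustrative only, not part of the original code.

```python
import numpy as np
import pandas as pd

# Two known values from the question, with four missing years in between.
df = pd.DataFrame(
    {"metric": [290.72, np.nan, np.nan, np.nan, np.nan, 221.763]},
    index=pd.date_range("2020-01-01", periods=6, freq="12MS"),
)

# interpolate() fills the NaN entries; existing values are left untouched.
filled = df.interpolate(method="linear")
print(filled)
```

The rows to be filled must already exist in the frame as NaN for this to have any effect.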

It turns out that interpolation only fills in values where there are none (NaN). In my case above, I first had to re-index so that the index contained one row every 12 months; the newly added years then show up as NaN and can be interpolated.

# reindex with an interval of 12 months (M: month, S: start of the month)
# start at 2020-01-01, the first sample in test.csv
df_reindexed = df.reindex(pd.date_range(start='20200101', end='20350101', freq='12MS'))

# method=linear works because the intervals are equally spaced out now
df_interpolated = df_reindexed.interpolate(method='linear')
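
For reference, here is a self-contained sketch of the whole flow, assuming the CSV contents from the question are inlined as a string instead of read from test.csv. Building the index with pd.to_datetime is a substitution for the parse_dates dict used above, which newer Pandas releases warn about; the names csv_text, yearly, by_row and by_time are made up for illustration. It also compares method='linear' with method='time', which weights by the actual number of days and so differs slightly because of leap years.

```python
import io
import pandas as pd

# Same data as test.csv in the question, inlined so the example runs on its own.
csv_text = """year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016
"""

df = pd.read_csv(io.StringIO(csv_text))
# Turn the integer year into a January 1 timestamp and use it as the index.
df["year"] = pd.to_datetime(df["year"].astype(str), format="%Y")
df = df.set_index("year")

# One row per year; years missing from the CSV become NaN.
yearly = df.reindex(pd.date_range(start="2020-01-01", end="2035-01-01", freq="12MS"))

by_row = yearly.interpolate(method="linear")  # treats rows as equally spaced
by_time = yearly.interpolate(method="time")   # weights by elapsed days

print(by_row.loc["2021":"2024"])
print(by_time.loc["2021":"2024"])
```

For yearly data the two results differ only in the second decimal place or so, so method='linear' is usually adequate here.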
