Python / Pandas: Fill NaN in order - linear interpolation --> ffill --> bfill

Question

I have a df:

     company  year      revenues
0  company 1  2019   1,425,000,000
1  company 1  2018   1,576,000,000
2  company 1  2017   1,615,000,000
3  company 1  2016   1,498,000,000
4  company 1  2015   1,569,000,000
5  company 2  2019             nan
6  company 2  2018   1,061,757,075
7  company 2  2017             nan
8  company 2  2016     573,414,893
9  company 2  2015     599,402,347

I would like to fill the nan values in a specific order: linearly interpolate first, then forward fill, and then backward fill. I currently have:

f_2_impute = [x for x in cl_data.columns if cl_data[x].dtypes != 'O' and 'total' not in x and 'year' not in x]

def ffbf(x):
    return x.ffill().bfill()

group_with = ['company']

for x in cl_data[f_2_impute]:
    cl_data[x] = cl_data.groupby(group_with)[x].apply(lambda fill_it: ffbf(fill_it))

which performs ffill() and bfill(). Ideally I want a function that first tries to linearly interpolate the missing values, then forward fills whatever is left, and then backward fills the rest.

Is there a quick way of achieving this? Thanks in advance.

Answer 1

I believe you first need to convert the columns to floats if they contain , as a thousands separator:

df = pd.read_csv(file, thousands=',')

Or:

df['revenues'] = df['revenues'].replace(',','', regex=True).astype(float)

and then add DataFrame.interpolate before the two fills:

def ffbf(x):
    return x.interpolate().ffill().bfill()
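
To see the whole thing end to end, here is a minimal sketch built on the sample data from the question. It assumes the revenues arrive as strings with comma separators. The column name revenues_filled is made up for illustration. It uses pd.to_numeric for the conversion and groupby(...).transform instead of apply (both are swapped in here, not taken from the answer above), since transform returns a result aligned to the original index:

```python
import pandas as pd

# Sample data from the question; revenues are strings with thousands separators.
df = pd.DataFrame({
    "company": ["company 1"] * 5 + ["company 2"] * 5,
    "year": [2019, 2018, 2017, 2016, 2015] * 2,
    "revenues": [
        "1,425,000,000", "1,576,000,000", "1,615,000,000",
        "1,498,000,000", "1,569,000,000",
        "nan", "1,061,757,075", "nan", "573,414,893", "599,402,347",
    ],
})

# Strip the commas and convert to float; anything unparseable becomes NaN.
df["revenues"] = pd.to_numeric(
    df["revenues"].str.replace(",", "", regex=False), errors="coerce"
)

def ffbf(x):
    # Linear interpolation first, then forward fill, then backward fill.
    return x.interpolate().ffill().bfill()

# transform applies ffbf per company and keeps the original row index.
# "revenues_filled" is a hypothetical column name used only in this sketch.
df["revenues_filled"] = df.groupby("company")["revenues"].transform(ffbf)

print(df)
```

In this sketch, row 7 sits between two known values, so interpolation gives their midpoint (817,585,984). Row 5 has no earlier value in its group, so interpolate and ffill leave it untouched and bfill copies 1,061,757,075 from row 6. Note that the default linear method ignores the index and treats rows as equally spaced, so the row order within each company matters.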
