Python / Pandas: Fill NaN with order - linear interpolation --> ffill --> bfill
I have a df:
company year revenues
0 company 1 2019 1,425,000,000
1 company 1 2018 1,576,000,000
2 company 1 2017 1,615,000,000
3 company 1 2016 1,498,000,000
4 company 1 2015 1,569,000,000
5 company 2 2019 nan
6 company 2 2018 1,061,757,075
7 company 2 2017 nan
8 company 2 2016 573,414,893
9 company 2 2015 599,402,347
I would like to fill the nan values in a specific order: linearly interpolate first, then forward fill, and then backward fill. I currently have:
f_2_impute = [x for x in cl_data.columns if cl_data[x].dtypes != 'O' and 'total' not in x and 'year' not in x]
def ffbf(x):
    return x.ffill().bfill()
group_with = ['company']
for x in cl_data[f_2_impute]:
    cl_data[x] = cl_data.groupby(group_with)[x].apply(lambda fill_it: ffbf(fill_it))
which performs ffill() and bfill() within each company. Ideally I want a function that first tries to linearly interpolate the missing values, then forward fills whatever is still missing, and finally backward fills the rest.
Is there a quick way of achieving this? Thanks in advance.
I believe you first need to convert the columns to floats if the values contain , as a thousands separator:
df = pd.read_csv(file, thousands=',')
Or:
df['revenues'] = df['revenues'].replace(',','', regex=True).astype(float)
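As a quick check of the two conversions above, here is a minimal sketch on a small in-memory CSV. The sample text and the names csv_text, parsed and as_text are made up for illustration; only the thousands=',' option and the replace/astype chain come from the answer.

```python
import io
import pandas as pd

# Hypothetical sample: two missing revenues, two values with thousands separators.
csv_text = (
    "company,year,revenues\n"
    "company 2,2019,\n"
    'company 2,2018,"1,061,757,075"\n'
    "company 2,2017,\n"
    'company 2,2016,"573,414,893"\n'
)

# Option 1: let the parser strip the thousands separator while reading.
parsed = pd.read_csv(io.StringIO(csv_text), thousands=",")

# Option 2: read the column as text, then remove the commas and cast to float.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"revenues": str})
as_text["revenues"] = as_text["revenues"].replace(",", "", regex=True).astype(float)

print(parsed["revenues"].tolist())
print(as_text["revenues"].tolist())
```

Either way the column should end up numeric with the missing entries kept as NaN, which is what interpolation needs.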
Then add DataFrame.interpolate before the fills:
def ffbf(x):
    return x.interpolate().ffill().bfill()
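Putting the pieces together on the sample data from the question, a minimal end-to-end sketch could look like the following. It assumes the frame is called df rather than cl_data, that only the revenues column needs filling, and it uses groupby(...).transform instead of apply so that the result stays aligned with the original index; those choices are mine, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with revenues still stored as text.
df = pd.DataFrame({
    "company": ["company 1"] * 5 + ["company 2"] * 5,
    "year": [2019, 2018, 2017, 2016, 2015] * 2,
    "revenues": [
        "1,425,000,000", "1,576,000,000", "1,615,000,000",
        "1,498,000,000", "1,569,000,000",
        np.nan, "1,061,757,075", np.nan, "573,414,893", "599,402,347",
    ],
})

# Strip the thousands separators and convert to float.
df["revenues"] = df["revenues"].replace(",", "", regex=True).astype(float)

def ffbf(x):
    # Linear interpolation first, then forward fill, then backward fill.
    return x.interpolate().ffill().bfill()

# Fill within each company so values never leak across groups.
df["revenues"] = df.groupby("company")["revenues"].transform(ffbf)

print(df)
```

For company 2, the 2017 gap sits between two known values, so it is interpolated to their midpoint (817,585,984). The 2019 gap is at the start of the group, where there is nothing to interpolate from or forward fill with, so it is backward filled with the 2018 value. Note that the default linear method treats rows as equally spaced and follows row order, not the year column.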