What's the best way to replace NaN values (in a Pandas DataFrame) with values from a separate Pandas Series?

I started with a Pandas DataFrame that has a column with many NaN values.

I split this DataFrame into two DataFrames: one with the non-NaN rows and one with the NaN rows.

I fit a linear regression model to estimate the missing values as a function of the other columns.

So I now have a separate Pandas Series that holds the estimated values. It has the same length as the NaN DataFrame.

I now want to put these estimated values back into the NaN DataFrame, so that I can ultimately pd.concat() the two DataFrames into one DataFrame to use for my analysis.

I cannot figure out how to put these values back into the correct rows of the NaN DataFrame. Every time I try, only some of the NaNs get filled (and probably in the wrong order). It seems to have something to do with the way they're indexed.

df_nan["Column"] = y_predicted

This is how I've tried to do it, but it only fills in some of the rows, and incorrectly. Something to do with the indices, maybe?
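The symptom described above is consistent with pandas aligning on index labels when a Series is assigned to a column. Here is a minimal sketch of that behaviour, assuming y_predicted carries a fresh 0..n-1 index while df_nan keeps the row labels of the original DataFrame. The toy data and the `misaligned` name are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names mirror the question.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "Column": [10.0, np.nan, 30.0, np.nan, np.nan]})
df_nan = df[df["Column"].isna()].copy()   # keeps index labels 1, 3, 4

# Stand-in for the regression output: a Series with a default 0, 1, 2 index.
y_predicted = pd.Series([20.0, 40.0, 50.0])

# Assigning a Series aligns on index labels. Only label 1 exists in both,
# so one row gets a value (the wrong one) and the rest stay NaN.
misaligned = df_nan.copy()
misaligned["Column"] = y_predicted

# Possible fix: drop the Series index so values are written by position.
df_nan["Column"] = y_predicted.to_numpy()
```

If the predictions are already in the same row order as df_nan, assigning the underlying array (or a Series rebuilt with index=df_nan.index) sidesteps the label alignment.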

I think one way to do this is the following: keep your raw DataFrame and use apply on the column you want to impute.

df['imputed_column'] = df.apply(lambda x: x.Column if pd.notnull(x.Column) else y_predicted[x.name], axis=1)

The line above takes the estimated value wherever the original value is null (x.name is the index label of the row); otherwise it keeps the original value. Note that y_predicted[x.name] looks values up by label, so this assumes y_predicted is indexed by the same row labels as df.
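A vectorized variant of the same idea may be simpler than a row-wise apply. This is only a sketch, assuming the predictions come back as a plain array in the same row order as the NaN rows. The toy data and the `mask` and `predicted_values` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "Column" is the column to impute.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "Column": [10.0, np.nan, 30.0, np.nan, np.nan]})
mask = df["Column"].isna()

# Stand-in for the model's predictions on the NaN rows, in row order.
predicted_values = np.array([20.0, 40.0, 50.0])

# Label each prediction with the index of the row it belongs to.
y_predicted = pd.Series(predicted_values, index=df.index[mask])

# fillna with a Series aligns on index labels and only fills missing entries.
df["imputed_column"] = df["Column"].fillna(y_predicted)
```

Because everything stays in the original DataFrame, there is no need to split it and pd.concat() the pieces back together afterwards.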
