
How to fill missing values using pandas?

I'm trying to fill missing values with values from another array, which is predicted by a regressor. I don't know how to replace the missing values with the corresponding values in that array.

For example, I have:

[0, 1, 2, NaN, NaN] 

and

[0, 0, 1, 2, 3]

How can I replace these NaN values with 2 and 3? It seems that fillna can't do this.

Sorry for having asked an ambiguous question.

First, you have to identify clearly what counts as a missing value: depending on your dataset, NaN, a string, an integer, or even 0 can represent one.

If your missing values are NaN, the easiest way is the following. If they are encoded some other way, you can first convert them to NaN with replace.

# let df be your dataframe and x be the value you want to fill it with
df.fillna(x)
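
As a minimal sketch of the two steps above, assuming a toy frame in which -1 is a hypothetical sentinel for "missing" (the column names and values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: -1 marks a missing reading in this made-up dataset
df = pd.DataFrame({"a": [1.0, -1.0, 3.0], "b": [4.0, 5.0, np.nan]})

# Step 1: convert the sentinel to NaN so pandas treats it as missing
df = df.replace(-1, np.nan)

# Step 2: fill every NaN with a single value x
x = 0
filled = df.fillna(x)
print(filled)
```

Note that fillna returns a new DataFrame here; df itself still contains the NaN values unless you assign the result back.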

The second way is to impute values using scikit-learn's SimpleImputer. Here is a simple example, assuming your missing values are NaN and you want to fill them with the mean of each column.

import numpy as np
from sklearn.impute import SimpleImputer
# fit_transform returns a NumPy array rather than a DataFrame
df = SimpleImputer(missing_values=np.nan, strategy='mean').fit_transform(df)

You can change the strategy to a different method, such as the median or the most frequent value of the column. It all depends on what works best for you.
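
A small sketch of switching the strategy, here to the median, and wrapping the result back into a DataFrame. The data and column names are made up for illustration, and it assumes scikit-learn's default output (a NumPy array):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Made-up data for illustration
df = pd.DataFrame({"a": [1.0, np.nan, 3.0, 10.0],
                   "b": [2.0, 4.0, np.nan, 6.0]})

imputer = SimpleImputer(missing_values=np.nan, strategy="median")

# fit_transform returns a NumPy array, so rebuild the DataFrame
# with the original column names and index
imputed = pd.DataFrame(imputer.fit_transform(df),
                       columns=df.columns, index=df.index)
print(imputed)
```

With this data the missing entry in column a becomes 3.0 (the median of 1, 3, 10) and the one in column b becomes 4.0 (the median of 2, 4, 6).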

Suppose there are 2 arrays:

import numpy as np
import pandas as pd
arr1 = pd.DataFrame([0, 1, 2, np.nan, np.nan])
arr2 = pd.DataFrame([0, 0, 1, 2, 3])

You can replace the NaN values in arr1 with the corresponding elements of arr2 via fillna:

arr1.fillna(arr2, inplace=True)

This is the result after executing fillna:

arr1 = [0, 1, 2, 2, 3]
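
The same idea applies when the replacement values come from a regressor as a plain NumPy array. The sketch below assumes the predictions contain one value per row, in the same order as the data; the names s and pred are placeholders, and pred stands in for whatever the model returns:

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 1, 2, np.nan, np.nan])
pred = np.array([0, 0, 1, 2, 3])  # stand-in for regressor output, one value per row

# fillna aligns on index labels, so give the predictions the same index as s
filled = s.fillna(pd.Series(pred, index=s.index))

# Positional alternative that ignores index labels entirely
filled_np = pd.Series(np.where(s.isna(), pred, s), index=s.index)

print(filled.tolist())
```

Because fillna matches on labels rather than positions, the explicit index=s.index matters whenever the data does not have a default 0..n-1 index, for example after filtering or shuffling rows.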

