
Using Imputer in scikit-learn

I need to fill in missing temperature values with the mean value for that month, using Imputer() in scikit-learn.

First I split the dataframe into groups based on the month. Then I called the imputer function to calculate the mean for that group and fill in the missing values.

Here is the code I wrote but it didn't work:

def impute_missing (data_1_group):
    imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
    imp.fit(data_1_group)
    data_1_group=imp.transform(data_1_group['datetime'])
    return(data_1_group)

for data_1_group in data_1.groupby(pd.TimeGrouper("M")):
    impute_missing(data_1_group)

Any suggestions?

Try this small change:

imp = imp.fit(data_1_group['datetime'])
data_1_group = imp.transform(data_1_group['datetime'])

Though I'm new to scikit-learn myself, I am recommending the solution that worked for me. The change is needed because:

1) the imputer is fitted first and the fitted object is kept in imp, as in the first line

2) the imputer needs to be fitted on and applied to the same data, which in this case seems to be data_1_group['datetime']
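
For readers on current library versions, here is a minimal sketch of the same idea, not a definitive implementation. It rests on a few assumptions: Imputer was removed in scikit-learn 0.22, so the sketch swaps in SimpleImputer from sklearn.impute; pd.TimeGrouper is no longer available in current pandas, so the rows are grouped by the month period of the index instead; and the sample data and the 'temperature' column name are made up for illustration. It also unpacks the (key, group) pairs that groupby yields and keeps the returned frames, which the loop in the question does not do.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical data: daily temperatures on a DatetimeIndex, with gaps.
idx = pd.date_range("2020-01-30", periods=6, freq="D")
data_1 = pd.DataFrame(
    {"temperature": [10.0, np.nan, 20.0, np.nan, 30.0, 22.0]}, index=idx
)

def impute_missing(group):
    """Fill NaNs in one month's rows with that month's column means."""
    imp = SimpleImputer(missing_values=np.nan, strategy="mean")
    # Fit and transform the same 2-D data; the result is a NumPy array.
    values = imp.fit_transform(group)
    return pd.DataFrame(values, index=group.index, columns=group.columns)

# Group rows by calendar month, impute each group, and stitch them back.
month_key = data_1.index.to_period("M")
filled = pd.concat([impute_missing(g) for _, g in data_1.groupby(month_key)])

print(filled)
```

One caveat with this sketch: if a column is entirely NaN within a month, SimpleImputer may drop that column from its output, and rebuilding the DataFrame with the original column names would then fail.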

I hope this helps
