
In sklearn.preprocessing module I get ValueError: Found array with 0 feature(s)

I have seen a number of questions about this error, but I could not work out how they relate to my code or problem.

I am trying to fix the NaN values in data from a sample CSV file that I found on the internet. My code is actually very simple:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Importing stuff.
from sklearn.preprocessing import Imputer
import pandas

# Loading the corrupt data
corrupt_data = pandas.read_csv('SampleCorruptData.csv')

#Creating Imputer object
imputer = Imputer(missing_values = 'NaN', strategy= "mean", axis = 0)

owner_id = corrupt_data.iloc[:,2:]

print(owner_id)

imputer = imputer.fit(owner_id.iloc[:,2:])

owner_id.iloc[:,2:] = imputer.transform(owner_id[:,2:])

print(owner_id)

The CSV file:

GroupName,Groupcode,GroupOwner
System Administrators,sysadmin,13456
Independence High Teachers,HS Teachers,
John Glenn Middle Teachers,MS Teachers,13458
Liberty Elementary Teachers,Elem Teachers,13559
1st Grade Teachers,1stgrade,NaN
2nd Grade Teachers,2nsgrade,13561
3rd Grade Teachers,3rdgrade,13562
Guidance Department,guidance,NaN
Independence Math Teachers,HS Math,13660
Independence English Teachers,HS English,13661
John Glenn 8th Grade Teachers,8thgrade,
John Glenn 7th Grade Teachers,7thgrade,13452
Elementary Parents,Elem Parents,NaN
Middle School Parents,MS Parents,18001
High School Parents,HS Parents,18002

As you can see, some GroupOwner values are NaN or missing.

The error I get:

Traceback (most recent call last):

  File "<ipython-input-21-1bfc8eb216cc>", line 1, in <module>
    runfile('/home/teoman/Desktop/data science/Fix Corrupt Data/imputation.py', wdir='/home/teoman/Desktop/data science/Fix Corrupt Data')

  File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/teoman/Desktop/data science/Fix Corrupt Data/imputation.py", line 18, in <module>
    imputer = imputer.fit(owner_id.iloc[:,2:])

  File "/home/teoman/.local/lib/python3.5/site-packages/sklearn/preprocessing/imputation.py", line 155, in fit
    force_all_finite=False)

  File "/home/teoman/.local/lib/python3.5/site-packages/sklearn/utils/validation.py", line 470, in check_array
    context))

ValueError: Found array with 0 feature(s) (shape=(15, 0)) while a minimum of 1 is required.

What am I doing wrong here?

If we trace the error, we can find the solution.

Your error is:

ValueError: Found array with 0 feature(s) (shape=(15, 0)) while a minimum of 1 is required.

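Basically, the imputer requires at least one feature. The docs for Imputer describe its input as:

Parameters: X : numpy array of shape [n_samples, n_features]

In your case you have 15 n_samples and 0 n_features. The CSV has only three columns, so corrupt_data.iloc[:,2:] already leaves just GroupOwner in owner_id. Slicing again with owner_id.iloc[:,2:] asks for columns from position 2 onward of a one-column frame, and that returns no columns. If you pass data with n_features > 0, for example owner_id itself, your problem will be solved.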
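The zero-column shape can be reproduced without scikit-learn. The snippet below is only a sketch: it uses a shortened inline copy of the CSV (four rows rather than fifteen) and the illustrative names csv_text, data and empty, none of which come from the question.

```python
import io
import pandas as pd

# A few rows in the same three-column layout as the CSV in the question.
csv_text = """GroupName,Groupcode,GroupOwner
System Administrators,sysadmin,13456
Independence High Teachers,HS Teachers,
1st Grade Teachers,1stgrade,NaN
2nd Grade Teachers,2nsgrade,13561
"""
data = pd.read_csv(io.StringIO(csv_text))

# First slice: columns from position 2 onward, so only GroupOwner remains.
owner_id = data.iloc[:, 2:]
print(owner_id.shape)  # rows x 1 column

# Second slice on the already-sliced frame: there is no position 2,
# so the result keeps the rows but has zero columns.
empty = owner_id.iloc[:, 2:]
print(empty.shape)  # rows x 0 columns, which is what fit() rejects
```

Printing .shape before calling fit() is a quick way to catch this kind of mistake.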

Keep in mind that a 1D numpy array has no feature axis at all. If you reshape it with numpy.reshape(), for example to shape (-1, 1), or convert it to a pd.DataFrame, you get 1 n_features.
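A minimal sketch of the reshape step follows. It assumes scikit-learn 0.20 or later, where SimpleImputer from sklearn.impute takes the place of the Imputer class used in the question (Imputer was removed in later releases). The array values are a few owner ids taken from the CSV, and the variable names are illustrative only.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Owner ids with gaps as a 1D array: shape (5,), no feature axis.
owners_1d = np.array([13456.0, np.nan, 13458.0, 13559.0, np.nan])

# reshape(-1, 1) turns it into 5 samples x 1 feature.
owners_2d = owners_1d.reshape(-1, 1)

# Replace each NaN with the mean of the non-missing values in the column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
filled = imputer.fit_transform(owners_2d)

print(owners_1d.shape, owners_2d.shape, filled.shape)
print(filled.ravel())
```

On an older scikit-learn that still ships Imputer, the same reshape applies; only the import and constructor differ.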

I hope this helps

Thank you
