I was just trying out for DataPreprocessing where I frequently get this error.Can anyone explain me what is wrong in this particular code for the given dataset?
Thanks in advance!
# STEP 1: IMPORTING THE LIBARIES
import numpy as np
import pandas as pd
# STEP 2: IMPORTING THE DATASET
dataset = pd.read_csv("https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/datasets/Data.csv", error_bad_lines=False)
X = dataset.iloc[:,:-1].values
Y = dataset.iloc[:,1:3].values
# STEP 3: HANDLING THE MISSING VALUES
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = "NaN",strategy = "mean",axis = 0)
imputer = imputer.fit(X[ : , 1:3])
X[:,1:3] = imputer.transform(X[:,1:3])
# STEP 4: ENCODING CATEGPRICAL DATA
from sklearn.preprocessing import LaberEncoder,OneHotEncoder
labelencoder_X = LabelEncoder() # Encode labels with value between 0 and n_classes-1.
X[ : , 0] = labelencoder_X.fit_transform(X[ : , 0]) # All the rows and first columns
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()
labelencoder_Y = LabelEncoder()
Y = labelencoder_Y.fit_transform(Y)
# Step 5: Splitting the datasets into training sets and Test sets
from sklearn.cross_validation import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split( X , Y , test_size = 0.2, random_state = 0)
# Step 6: Feature Scaling
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.fit_transform(X_test)
Returns Error:
ValueError: Found array with 0 feature(s) (shape=(546, 0)) while a minimum of 1 is required.
Your link in this line
dataset = pd.read_csv("https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/datasets/Data.csv", error_bad_lines=False)
is wrong.
The current link returns the webpage on github where this csv is shown, but not the actual csv data. So whatever data is present in dataset
is invalid.
Change that to:
dataset = pd.read_csv("https://raw.githubusercontent.com/Avik-Jain/100-Days-Of-ML-Code/master/datasets/Data.csv", error_bad_lines=False)
Other than that, there is a spelling mistake in LabelEncoder
import.
Now even if you correct these, there will still be errors, because of
Y = labelencoder_Y.fit_transform(Y)
LabelEncoder only accepts a single column array as input, but your current Y
will be of 2 columns due to
Y = dataset.iloc[:,1:3].values
Please explain more clearly what do you want to do.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.