Let me explain: I'm working with an artificial neural network. The model has 15 variables, 14 independent and one dependent. Among the independent variables there are 3 categorical variables (day of week, month, and direction (north, south, etc.)). I have already label-encoded them (Monday = 1, Tuesday = 2, and so on) and also one-hot encoded them (Monday = [1,0,0,0], Tuesday = [0,1,0,0]).
My question is: how can I make a prediction with new values, something like this?
X=['Monday','January','South']
Here is the code:
# Classification template
# Importing the libraries
import numpy as np
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('clean.csv')
X = dataset.iloc[:, [4,5,6,9,12,15,16]].values
y = dataset.iloc[:, 14].values
#Encoding categorical Data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelenconder_X = LabelEncoder()
X[:,1] = labelenconder_X.fit_transform(X[:,1])
labelenconder_X_2 = LabelEncoder()
X[:,2] = labelenconder_X_2.fit_transform(X[:,2])
labelenconder_X_7 = LabelEncoder()
X[:,4] = labelenconder_X_7.fit_transform(X[:,4])
labelenconder_X_9 = LabelEncoder()
X[:,5] = labelenconder_X_9.fit_transform(X[:,5])
labelenconder_X_10 = LabelEncoder()
X[:,6] = labelenconder_X_10.fit_transform(X[:,6])
onehotencoder = OneHotEncoder(categorical_features=[1,2,4,5,6])
X = onehotencoder.fit_transform(X).toarray()
X = X[:, 1:]
# Splitting the dataset into the Training set and Test set
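# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split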
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
# Feature Scaling
#from sklearn.preprocessing import StandardScaler
#sc = StandardScaler()
#X_train = sc.fit_transform(X_train)
#X_test = sc.transform(X_test)
# Fitting classifier to the Training set
# Create your classifier here
import keras
from keras.models import Sequential
from keras.layers import Dense
classifier = Sequential()
#INPUT LAYER AND HIDDEN LAYER
classifier.add(Dense(units = 5, kernel_initializer = 'uniform', activation = 'relu', input_dim =9))
#ADDING SECOND HIDDEN LAYER
classifier.add(Dense(units = 5, kernel_initializer = 'uniform', activation = 'relu'))
#adding output node
classifier.add(Dense(units= 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Applying Stochastic Gradient Descent (here via the Adam optimizer)
classifier.compile(optimizer='adam', loss = 'binary_crossentropy', metrics=['accuracy'])
classifier.fit(X_train, y_train, batch_size =28, epochs = 100)
classifier.save('ANN2.h5')
model = keras.models.load_model('ANN2.h5')
y_predict = model.predict(X_test)
y_predict = (y_predict > 0.40)
If you want to encode all days of the week for a prediction, Monday should probably be [1,0,0,0,0,0,0], since a one-hot vector needs one position per category. Alternatively, you could use regression (0.0 - 6.0) instead of classification.
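To illustrate the seven-position encoding mentioned above, here is a minimal sketch in plain Python. The `DAYS` list and the `one_hot` helper are hypothetical names introduced only for this example; the key assumption is that the category order is the same one used when the model was trained.

```python
# Hypothetical category list; its order must match the order used at training time.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def one_hot(value, categories):
    """Return a list with a 1 at the position of `value` and 0 everywhere else."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


monday = one_hot("Monday", DAYS)    # one position per day of the week
tuesday = one_hot("Tuesday", DAYS)
print(monday)
print(tuesday)
```

With seven categories every vector has seven positions, which is why a four-element vector such as [1,0,0,0] cannot represent all days of the week.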
However, since you wrote X instead of y here, I'm not sure whether your X=['Monday','January','South'] is meant to be the input rather than the output (y). If it is the input, you do not need a one-hot encoding and can just encode it as, e.g., X=[0,0,2]
with
I agree with @morsecodist that more information is needed for a proper answer to your question.
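For the original question of predicting on new values, one common approach is to keep the encoder that was fitted on the training data and reuse it on the new row, rather than fitting a new one. The sketch below assumes a recent scikit-learn in which `OneHotEncoder` accepts string columns directly; the toy rows and the names `X_train_cat`, `encoder`, and `new_sample` are made up for illustration and are not taken from the asker's `clean.csv`.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy training rows (hypothetical values): day of week, month, direction.
X_train_cat = np.array([
    ["Monday", "January", "South"],
    ["Tuesday", "February", "North"],
    ["Monday", "March", "East"],
    ["Sunday", "January", "West"],
], dtype=object)

# Fit the encoder once on the training data and keep the fitted object.
# handle_unknown="ignore" encodes unseen categories as all zeros instead of raising.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train_encoded = encoder.fit_transform(X_train_cat).toarray()

# Encode a new sample with the SAME fitted encoder (transform, not fit_transform).
new_sample = np.array([["Monday", "January", "South"]], dtype=object)
new_encoded = encoder.transform(new_sample).toarray()

print(encoder.categories_)  # learned categories, sorted per column
print(new_encoded)          # one row with the same column layout as the training data
```

The encoded row could then be passed to something like `classifier.predict(new_encoded)`, provided it goes through exactly the same steps as the training data (same column order, the same dropped column if one was removed, and the same scaling if any) and the network's `input_dim` matches the encoded width.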