How to predict the label for new input values using an artificial neural network in Python
I am new to machine learning. I am making a Streamlit app for multiclass classification using an artificial neural network (ANN). My question is about the ANN model, not about Streamlit. I know I can use MLPClassifier, but I want to build and train my own model. So I used the following code to analyze the following data.
[Image: sample of the input data (not recovered)]
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import cross_val_score, cross_validate
from sklearn.model_selection import GridSearchCV
df=pd.read_csv("./Churn_Modelling.csv")
#Drop Unwanted features
df.drop(columns=['Surname','RowNumber','CustomerId'],inplace=True)
df.head()
#Label Encoding of Categ features
df['Geography']=df['Geography'].map({'France':0,'Spain':1,'Germany':2})
df['Gender']=df['Gender'].map({'Male':0,'Female':1})
#Input & Output selection
X=df.drop('Exited',axis=1)
#Class labels must be integers 0..n_classes-1 for sparse_categorical_crossentropy
Y = df['Exited'].map({'yes':0, 'no':1, 'maybe':2})
#train test split
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.3,random_state=12,stratify=Y)
#scaling (features only; the class labels are not scaled)
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
X_test=ss.transform(X_test)
#build ANN
model=Sequential()
model.add(Dense(units=30,activation='relu',input_shape=(X.shape[1],)))
model.add(Dropout(rate = 0.2))
model.add(Dense(units=18,activation='relu'))
model.add(Dropout(rate = 0.1))
#one output unit per class ('yes', 'no', 'maybe')
model.add(Dense(units=3,activation='softmax'))
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
#create callback
cb=EarlyStopping(
    monitor="val_loss",  #loss on the validation data
    min_delta=0.00001,  #smallest change that counts as an improvement
    patience=15,  #epochs without improvement before stopping (never reached with epochs=10 below)
    verbose=1,
    mode="auto",  #direction is inferred from the monitored quantity (minimize for a loss)
    baseline=None,
    restore_best_weights=False
)
trained_model=model.fit(X_train,Y_train,epochs=10,
                        validation_data=(X_test,Y_test),
                        callbacks=[cb],
                        batch_size=10
                        )
#evaluate once per data set; evaluate returns [loss, accuracy]
train_loss, train_acc = model.evaluate(X_train,Y_train)
print("Training accuracy :",train_acc)
print("Training loss :",train_loss)
test_loss, test_acc = model.evaluate(X_test,Y_test)
print("Testing accuracy :",test_acc)
print("Testing loss :",test_loss)
y_pred_prob=model.predict(X_test)
y_pred=np.argmax(y_pred_prob, axis=-1)
print(classification_report(Y_test,y_pred))
print(confusion_matrix(Y_test,y_pred))
plt.figure(figsize=(7,5))
sns.heatmap(confusion_matrix(Y_test,y_pred),annot=True,cmap="OrRd_r",
            fmt="d",cbar=True,
            annot_kws={"fontsize":15})
#confusion_matrix puts the actual classes on rows (y-axis) and the predictions on columns (x-axis)
plt.xlabel("Predicted Result")
plt.ylabel("Actual Result")
plt.show()
Then, I will save the model either by using pickle as follows:
# pickle_out = open("./my_model.pkl", mode = "wb")
# pickle.dump(my_model, pickle_out)
# pickle_out.close()
or as follows:
model.save('./my_model.h5')
Now, I want to predict the label (i.e. 'yes', 'no', 'maybe', etc.) of the output variable 'Exited' based on new input values (as shown in the following table) that will be provided by a user:
[Image: table of new input values with an empty Exited column (not recovered)]
My question is: how should I save and load the model and then predict the labels for the 'Exited' variable, so that the empty cells of the Exited column are automatically filled with the respective labels (i.e. 'yes', 'no', 'maybe', etc.)?
I will appreciate your insightful comments on this post.
Once you have your model trained, you can simply run model.predict with the data you wish to predict on. The tricky parts of this process are making sure this data is the right shape and that the indices match up.
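Because the prediction will run in a separate Streamlit session, the trained model has to be written to disk and reloaded first, together with the fitted StandardScaler, since new inputs must be scaled with the statistics learned from the training data. The following is a minimal sketch rather than the asker's exact setup: the tiny untrained network, the random data, and the file name scaler.joblib are placeholders chosen for illustration. It uses model.save and load_model for the network, which is the route Keras documents for this, and joblib for the scaler.

```python
import os
import tempfile

import joblib
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the real training features (4 columns).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 4))

# Fit the scaler on the training features only.
scaler = StandardScaler().fit(X_train)

# Tiny stand-in for the trained network: 3 classes, softmax output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Save both artifacts (directory and scaler file name are arbitrary).
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "my_model.h5")
scaler_path = os.path.join(out_dir, "scaler.joblib")
model.save(model_path)
joblib.dump(scaler, scaler_path)

# Later, e.g. at the top of the Streamlit script: load them back.
loaded_model = tf.keras.models.load_model(model_path)
loaded_scaler = joblib.load(scaler_path)

# The reloaded pair should reproduce the original predictions.
X_new = rng.normal(size=(5, 4))
before = model.predict(scaler.transform(X_new), verbose=0)
after = loaded_model.predict(loaded_scaler.transform(X_new), verbose=0)
print(np.allclose(before, after))
```

In the app itself only the two load calls are needed; the save calls belong at the end of the training script.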
I typically use this recipe. Note that the features need to be in the exact same shape and order that the model was trained with, and preprocessed the same way (here: the same label encoding and the StandardScaler fitted on the training data):
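to_predict = df[features]  # features: the training column names, in training order
probabilities = model.predict(
    to_predict.to_numpy().reshape(-1, len(features))
)
# assuming a softmax output layer: one probability per class, keep the most likely class index
predictions = probabilities.argmax(axis=-1)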
predictions should be the same length as to_predict and it will be an np.array. You can get this back into a DataFrame with the same indices as to_predict by using:
predictions = pd.DataFrame(
    predictions,
    columns=["predicted_value"],  # Anything you want
    index=to_predict.index,
)
In your case, this should give values of 0, 1, 2. You will need to map these values back to 'yes', 'no', 'maybe'. To avoid overcomplicating things, you can just use a map on this new DataFrame:
predictions["predicted_value"] = predictions["predicted_value"].map({0: 'yes', 1: 'no', 2: 'maybe'})
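Keep in mind that for a network with a softmax output layer, model.predict returns one probability per class rather than a class index, so an argmax step is needed before the map. A minimal sketch of that step, assuming three classes encoded as 0, 1, 2; the probs array is made-up output standing in for model.predict, and the index values are arbitrary:

```python
import numpy as np
import pandas as pd

# Made-up probabilities, shaped like model.predict output for 4 rows and 3 classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])

# Index of the rows that were sent to the model (arbitrary labels here).
row_index = pd.Index([10, 11, 12, 13])

# Integer code -> label; this must mirror the encoding used for training.
code_to_label = {0: "yes", 1: "no", 2: "maybe"}

class_idx = probs.argmax(axis=-1)  # most likely class per row
labels = pd.Series(class_idx, index=row_index, name="predicted_value").map(code_to_label)
confidence = pd.Series(probs.max(axis=-1), index=row_index, name="confidence")

print(pd.concat([labels, confidence], axis=1))
```

Keeping the winning probability next to the label is optional, but it lets the app show how sure the model is about each row.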
Now we need to merge these predictions back with the original df:
df = df.merge(
    predictions, left_index=True, right_index=True, how="outer"
)
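For the original question, the empty cells of the Exited column can also be filled in place instead of merged. The sketch below is one way to do it under stated assumptions: fill_exited and DummyModel are hypothetical names, DummyModel only stands in for the loaded Keras model so that the example runs without TensorFlow, and the three-column toy table is not the real Churn_Modelling data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

CODE_TO_LABEL = {0: "yes", 1: "no", 2: "maybe"}


class DummyModel:
    """Hypothetical stand-in for the loaded Keras model (3-class softmax)."""

    def predict(self, x):
        # Deterministic fake probabilities with one row per input row.
        logits = np.stack([x[:, 0], x[:, 1], -x[:, 0]], axis=1)
        exp = np.exp(logits - logits.max(axis=1, keepdims=True))
        return exp / exp.sum(axis=1, keepdims=True)


def fill_exited(df, model, scaler, features, target="Exited"):
    """Predict a label for every row whose target cell is empty."""
    out = df.copy()
    missing = out[target].isna()
    if missing.any():
        # Same columns, same order, same scaling as during training.
        x_new = scaler.transform(out.loc[missing, features].to_numpy())
        class_idx = np.asarray(model.predict(x_new)).argmax(axis=-1)
        out.loc[missing, target] = [CODE_TO_LABEL[i] for i in class_idx]
    return out


features = ["CreditScore", "Age", "Balance"]
train = pd.DataFrame({
    "CreditScore": [600, 650, 700, 720],
    "Age": [30, 40, 50, 35],
    "Balance": [0.0, 1000.0, 5000.0, 250.0],
})
scaler = StandardScaler().fit(train[features].to_numpy())

new_rows = pd.DataFrame({
    "CreditScore": [610, 705],
    "Age": [33, 48],
    "Balance": [100.0, 4000.0],
    "Exited": [None, "no"],
})
filled = fill_exited(new_rows, DummyModel(), scaler, features)
print(filled)
```

Rows that already carry a label are left untouched, and the input frame is copied, so the user's table is never overwritten by accident.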