I am working on a neural network that predicts heart disease. The data comes from Kaggle and has been pre-processed. I have tried several models, such as logistic regression, random forests, and SVM, which all produce solid results. I'm now trying the same data with a neural network to see whether it can outperform the other ML models (the data set is rather small, which may explain the poor results). My code for the network is below. The model reaches only 50% accuracy, which is obviously too low to be useful. From what you can tell, does anything look off that would undermine the accuracy of the model?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.layers import Dense, Dropout
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping
df = pd.read_csv(r"C:\Users\***\Desktop\heart.csv")
X = df[['age','sex','cp','trestbps','chol','fbs','restecg','thalach']].values
y = df['target'].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit_transform(X_train)
scaler.transform(X_test)
nn = tf.keras.Sequential()
nn.add(Dense(30, activation='relu'))
nn.add(Dropout(0.2))
nn.add(Dense(15, activation='relu'))
nn.add(Dropout(0.2))
nn.add(Dense(1, activation='sigmoid'))
nn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=25)
nn.fit(X_train, y_train, epochs=1000, validation_data=(X_test, y_test), callbacks=[early_stop])
model_loss = pd.DataFrame(nn.history.history)
model_loss.plot()
predictions = nn.predict_classes(X_test)
from sklearn.metrics import classification_report,confusion_matrix
print(classification_report(y_test,predictions))
print(confusion_matrix(y_test,predictions))
Running your model with EarlyStopping, I get the following output:
Epoch 324/1000
23/23 [==============================] - 0s 3ms/step - loss: 0.5051 - accuracy: 0.7364 - val_loss: 0.4402 - val_accuracy: 0.8182
Epoch 325/1000
23/23 [==============================] - 0s 3ms/step - loss: 0.4716 - accuracy: 0.7643 - val_loss: 0.4366 - val_accuracy: 0.7922
Epoch 00325: early stopping
WARNING:tensorflow:From <ipython-input-54-2ee8517852a8>:54: Sequential.predict_classes (from tensorflow.python.keras.engine.sequential) is deprecated and will be removed after 2021-01-01.
Instructions for updating:
Please use instead:
* `np.argmax(model.predict(x), axis=-1)`, if your model does multi-class classification (e.g. if it uses a `softmax` last-layer activation).
* `(model.predict(x) > 0.5).astype("int32")`, if your model does binary classification (e.g. if it uses a `sigmoid` last-layer activation).
              precision    recall  f1-score   support

           0       0.90      0.66      0.76       154
           1       0.73      0.93      0.82       154

    accuracy                           0.79       308
   macro avg       0.82      0.79      0.79       308
weighted avg       0.82      0.79      0.79       308
That is a reasonable accuracy and F1-score for such a simple MLP.
I used this dataset: https://www.kaggle.com/abdulhakimrony/heartcsv/data
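The deprecation warning in the output above recommends thresholding the sigmoid output instead of calling predict_classes. A minimal sketch of that replacement, using made-up probabilities in place of the real `nn.predict(X_test)` output (`to_class_labels` is a hypothetical helper name, not part of Keras):

```python
import numpy as np

def to_class_labels(probabilities, threshold=0.5):
    """Turn sigmoid outputs of shape (n, 1) into a flat array of 0/1 labels."""
    return (probabilities > threshold).astype("int32").ravel()

# Made-up probabilities, standing in for nn.predict(X_test)
example_probs = np.array([[0.10], [0.50], [0.90]])
predictions = to_class_labels(example_probs)
print(predictions)
```

With the real model, `predictions = to_class_labels(nn.predict(X_test))` could then be passed to classification_report and confusion_matrix as before.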
Train for all the epochs. The initial accuracy may be low, but the model should converge after a few epochs.
Set a seed in random, tensorflow and numpy to get reproducible results on each run.
If simple models show good accuracy, chances are the NN will outperform them, but you have to make sure the NN is not overfitting.
Check whether your data is imbalanced; if it is, try using class weights (the class_weight argument of fit).
You can try a tuner with cross-validation to find the best-performing model.
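Two of the tips above, seeding and class weights, can be sketched as follows. This is only an illustration: `set_seeds` and `balanced_class_weights` are hypothetical helper names, the labels are made up, and the weighting formula (n_samples / (n_classes * class_count)) is one common "balanced" heuristic rather than the only option.

```python
import random
import numpy as np

def set_seeds(seed=42):
    """Seed Python, NumPy and (if installed) TensorFlow."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass  # TensorFlow not available; only random and NumPy are seeded

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = np.bincount(y)
    return {cls: len(y) / (len(counts) * n) for cls, n in enumerate(counts) if n > 0}

set_seeds(42)
y_example = np.array([0, 0, 0, 1])  # made-up, imbalanced labels
weights = balanced_class_weights(y_example)
print(weights)
```

The resulting dictionary could then be passed to training as `nn.fit(..., class_weight=weights)`, so that errors on the rarer class count more in the loss.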
The scaler is not in-place; you need to save the scaled results.
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
You'll then get results more in line with what you were expecting.
              precision    recall  f1-score   support

           0       0.93      0.98      0.95       144
           1       0.98      0.93      0.96       164

    accuracy                           0.95       308
   macro avg       0.95      0.96      0.95       308
weighted avg       0.96      0.95      0.95       308
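To illustrate why the reassignment matters, here is a small sketch on made-up numbers. It uses plain NumPy in place of StandardScaler (an assumption, to keep the example dependency-free), and `standardize` is a hypothetical helper. Like the scaler's transform, it returns new arrays and leaves its inputs untouched, so the result has to be assigned to something.

```python
import numpy as np

def standardize(train, test):
    """Scale both arrays with the mean and std of the training data only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)   # population std (ddof=0)
    std[std == 0] = 1.0       # avoid division by zero for constant columns
    return (train - mean) / std, (test - mean) / std

# Made-up feature values, for illustration only
X_train = np.array([[50.0, 120.0], [60.0, 140.0], [70.0, 160.0]])
X_test = np.array([[55.0, 130.0]])

# Calling standardize(X_train, X_test) without keeping the return value
# would change nothing: the originals are not modified.
X_train_scaled, X_test_scaled = standardize(X_train, X_test)

print(X_train[0])                     # still the raw values
print(X_train_scaled.mean(axis=0))    # close to 0 per column
print(X_train_scaled.std(axis=0))     # close to 1 per column
```

Feeding the unscaled arrays to the network, as the original code effectively did, leaves features such as chol and age on very different scales, which is a plausible reason for the poor training behaviour.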