
TensorFlow Keras loss is NaN

As you can see below, I am trying to create an MLP with TensorFlow/Keras, but unfortunately the loss is always NaN when fitting. Do you have any advice?

As a second error, I get the message "'Functional' object has no attribute 'score'" when trying to measure accuracy with model.score, but I think this is triggered by the first problem.

Thanks to all.

import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from mpl_toolkits import mplot3d
from sklearn import datasets
from various import printShapes, printNumpy, print_Model_Accuracy, printLARGE, checkFormat
from sklearn.datasets import make_blobs
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

np.random.seed(1234)
#%matplotlib qt 
#%matplotlib inline
plt.rcParams["figure.figsize"] = [4*2, 4*2]

if 0:
    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.80, random_state=1234)

if 1:
    X, y = make_blobs(n_features=4, centers=3, n_samples=1000, cluster_std = 5.0,  random_state=1234)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1234)
    
print ("Target Label Example: y_train[0]:")
print (y_train[0])
print (type(y_train[0]))

printLARGE("MLP classifier TENSORFLOW")

tf.random.set_seed(1234)

Epochs = 10

inputs = keras.Input(shape=(4,), name="digits")
x = layers.Dense(100, activation="tanh", name="dense_1")(inputs)
x = layers.Dense(4, activation="tanh", name="dense_2")(x)
outputs = layers.Dense(3, activation="softmax", name="predictions")(x)

model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(
    optimizer=keras.optimizers.RMSprop(),  # Optimizer
    loss=keras.losses.SparseCategoricalCrossentropy(), # Loss function to minimize
    metrics=[keras.metrics.SparseCategoricalAccuracy()],  # List of metrics to monitor
)
printShapes(X_train, "X_train", y_train, "y_train")
# TRAINING      
model.fit(X_train, y_train, batch_size=64, epochs=Epochs)
printShapes(X_test, "X_test", y_test, "y_test")
# INFERENCE
y_test_predproba = model.predict(X_test)
print(y_test_predproba)
y_test_pred = np.argmax(y_test_predproba, axis = 1)
print(y_test_pred)

print_Model_Accuracy(model, X_test, y_test, y_test_pred)
  1. Using the tanh activation in the hidden layers is not a good choice here; it should be ReLU.
  2. Adding one more hidden layer will work better than adding more units to the first layer [for your task].
  3. However, more hidden layers make the model more vulnerable to over-fitting; adding Dropout layers solves the issue.

Finally, your model should be:

inputs = keras.Input(shape=(4,), name="digits")
x = layers.Dense(32, activation="relu", name="dense_1")(inputs)
x = layers.Dropout(0.2)(x)
x = layers.Dense(24, activation="relu", name="dense_2")(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(16, activation="relu", name="dense_3")(x)
outputs = layers.Dense(3, activation="softmax", name="predictions")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
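Regarding the second error: a Keras Model has no scikit-learn-style score method, so model.score raises "'Functional' object has no attribute 'score'" regardless of the NaN loss; model.evaluate is the Keras way to get the test loss and metrics. A minimal sketch, assuming the revised model is compiled and trained exactly as in the question:

model.compile(
    optimizer=keras.optimizers.RMSprop(),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
model.fit(X_train, y_train, batch_size=64, epochs=Epochs)

# evaluate() returns the compiled loss and metrics on the test set
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print("test loss:", loss, "test accuracy:", acc)

# equivalent accuracy check with scikit-learn on the argmax predictions
from sklearn.metrics import accuracy_score
y_test_pred = np.argmax(model.predict(X_test), axis=1)
print("sklearn accuracy:", accuracy_score(y_test, y_test_pred))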
