
Extremely high loss with consistent validation accuracy

This is a question from Coursera. Everything outputs as expected except for the training part. I have tried different layers, but the results were the same. Maybe I made some mistakes in my manipulation of the dataset?

I couldn't find it. Can somebody help? Thanks.

import csv
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from os import getcwd

def get_data(filename):
    # You will need to write code that will read the file passed
    # into this function. The first line contains the column headers
    # so you should ignore it
    # Each successive line contains 785 comma separated values between 0 and 255
    # The first value is the label
    # The rest are the pixel values for that picture
    # The function will return 2 np.array types. One with all the labels
    # One with all the images
    #
    # Tips:
    # If you read a full line (as 'row') then row[0] has the label
    # and row[1:785] has the 784 pixel values
    # Take a look at np.array_split to turn the 784 pixels into 28x28
    # You are reading in strings, but need the values to be floats
    # Check out np.array().astype for a conversion
    with open(filename) as training_file:
        # Your code starts here
        reader = csv.reader(training_file)
        next(reader, None)  # skip the header row
        images = []
        labels = []
        for row in reader:
            labels.append(row[0])
            image_data = row[1:785]
            images.append(np.array_split(image_data, 28))
        # Your code ends here
        labels = np.array(labels).astype('float')
        images = np.array(images).astype('float')
    return images, labels

path_sign_mnist_train = f"{getcwd()}/../tmp2/sign_mnist_train.csv"
path_sign_mnist_test = f"{getcwd()}/../tmp2/sign_mnist_test.csv"
training_images, training_labels = get_data(path_sign_mnist_train)
testing_images, testing_labels = get_data(path_sign_mnist_test)

# Keep these
print(training_images.shape)
print(training_labels.shape)
print(testing_images.shape)
print(testing_labels.shape)

# In this section you will have to add another dimension to the data
# So, for example, if your array is (10000, 28, 28)
# You will need to make it (10000, 28, 28, 1)

training_images = np.expand_dims(training_images, axis=-1)  # Your Code Here
testing_images = np.expand_dims(testing_images, axis=-1)  # Your Code Here

# Create an ImageDataGenerator and do Image Augmentation
train_datagen = ImageDataGenerator(rescale = 1./255.,
                                   rotation_range = 40,
                                   width_shift_range = 0.2,
                                   height_shift_range = 0.2,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True,
                                   fill_mode = 'nearest')

validation_datagen = ImageDataGenerator(rescale = 1./255.)
    
# Keep These
print(training_images.shape)
print(testing_images.shape)
    
# Their output should be:
# (27455, 28, 28, 1)
# (7172, 28, 28, 1)

# Define the model
# Use no more than 2 Conv2D and 2 MaxPooling2D
from tensorflow.keras.optimizers import RMSprop
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax')
])


# Compile Model. 
model.compile(loss = 'sparse_categorical_crossentropy',
              optimizer = RMSprop(lr=0.01),
              metrics = ['accuracy'])

# Train the Model
train_generator = train_datagen.flow(training_images, training_labels,
                                     batch_size = 10)
validation_generator = validation_datagen.flow(testing_images, testing_labels,
                                               batch_size = 10)
history = model.fit_generator(train_generator,
                              epochs=5,
                              steps_per_epoch=len(training_images) / 32,
                              validation_data=validation_generator)

model.evaluate(testing_images, testing_labels,verbose=0)

The output of the model is shown below:

Epoch 1/5
858/857 [==============================] - 78s 91ms/step - loss: 15.4250 - accuracy: 0.0422 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 2/5
858/857 [==============================] - 75s 88ms/step - loss: 15.4719 - accuracy: 0.0401 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 3/5
858/857 [==============================] - 77s 89ms/step - loss: 15.4230 - accuracy: 0.0431 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 4/5
858/857 [==============================] - 76s 89ms/step - loss: 15.4268 - accuracy: 0.0429 - val_loss: 15.5120 - val_accuracy: 0.0371
Epoch 5/5
858/857 [==============================] - 75s 88ms/step - loss: 15.4287 - accuracy: 0.0428 - val_loss: 15.5120 - val_accuracy: 0.0371

The batch size is low because the Jupyter notebook from Coursera limits it to 10.

Your code is correct. I suspect it has something to do with the optimizer. Try using Adam instead of RMSprop, and try setting the learning rate for Adam to 0.001, which is its default learning rate. Other than that, your notebook correctly extracts the labels and data and sets up the data generators, and the network appears correct.
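
For reference, here is a minimal sketch of the suggested change. Only the compile step differs from the notebook above; everything else stays the same. Note that on older TensorFlow versions the keyword argument is lr rather than learning_rate:

from tensorflow.keras.optimizers import Adam

# Recompile the same model with Adam at its default learning rate of 0.001.
# With RMSprop at lr=0.01 the updates are too large for this task, so the
# network gets stuck near chance-level accuracy (~1/26, i.e. the ~0.037
# seen in the log above) with a very high loss.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=Adam(learning_rate=0.001),
              metrics=['accuracy'])

Then re-run the training loop unchanged (model.fit_generator with the same generators) and the loss should start decreasing within the first epoch.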
