
Extremely high loss with consistent validation accuracy

This is an exercise from Coursera. Everything in the output is as expected apart from the training section. I have tried different layers, but the result is the same. Maybe I made some mistake when processing the dataset?

I can't find the problem. Can anyone help? Thanks.

import csv
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from os import getcwd

def get_data(filename):
    # You will need to write code that will read the file passed
    # into this function. The first line contains the column headers
    # so you should ignore it.
    # Each successive line contains 785 comma-separated values between 0 and 255.
    # The first value is the label.
    # The rest are the pixel values for that picture.
    # The function will return 2 np.array types. One with all the labels.
    # One with all the images.
    #
    # Tips:
    # If you read a full line (as 'row') then row[0] has the label
    # and row[1:785] has the 784 pixel values.
    # Take a look at np.array_split to turn the 784 pixels into 28x28.
    # You are reading in strings, but need the values to be floats.
    # Check out np.array().astype for a conversion.
    with open(filename) as training_file:
        # Your code starts here
        reader = csv.reader(training_file)
        next(reader, None)  # skip the header row
        images = []
        labels = []
        for row in reader:
            labels.append(row[0])
            image_data = row[1:785]
            # split the flat 784-value row into a 28x28 grid
            images.append(np.array_split(image_data, 28))
        # Your code ends here
        labels = np.array(labels).astype('float')
        images = np.array(images).astype('float')
    return images, labels

path_sign_mnist_train = f"{getcwd()}/../tmp2/sign_mnist_train.csv"
path_sign_mnist_test = f"{getcwd()}/../tmp2/sign_mnist_test.csv"
training_images, training_labels = get_data(path_sign_mnist_train)
testing_images, testing_labels = get_data(path_sign_mnist_test)

# Keep these
print(training_images.shape)
print(training_labels.shape)
print(testing_images.shape)
print(testing_labels.shape)

# In this section you will have to add another dimension to the data
# So, for example, if your array is (10000, 28, 28)
# You will need to make it (10000, 28, 28, 1)

training_images = np.expand_dims(training_images, axis=-1)  # Your Code Here
testing_images = np.expand_dims(testing_images, axis=-1)  # Your Code Here

# Create an ImageDataGenerator and do Image Augmentation
train_datagen = ImageDataGenerator(rescale = 1./255.,
                                   rotation_range = 40,
                                   width_shift_range = 0.2,
                                   height_shift_range = 0.2,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True,
                                   fill_mode = 'nearest')

validation_datagen = ImageDataGenerator(rescale = 1./255.)
    
# Keep These
print(training_images.shape)
print(testing_images.shape)
    
# Their output should be:
# (27455, 28, 28, 1)
# (7172, 28, 28, 1)

# Define the model
# Use no more than 2 Conv2D and 2 MaxPooling2D
from tensorflow.keras.optimizers import RMSprop
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax')
])


# Compile Model. 
model.compile(loss = 'sparse_categorical_crossentropy',
              optimizer = RMSprop(lr=0.01),
              metrics = ['accuracy'])

# Train the Model
train_generator = train_datagen.flow(training_images, training_labels,
                                     batch_size=10)
validation_generator = validation_datagen.flow(testing_images, testing_labels,
                                               batch_size=10)
history = model.fit_generator(train_generator,
                              epochs=5,
                              steps_per_epoch=len(training_images) / 32,
                              validation_data=validation_generator)

model.evaluate(testing_images, testing_labels, verbose=0)

The model's output looks like this:

Epoch 1/5
858/857 [==============================] - 78s 91ms/step - loss: 15.4250 - accuracy: 0.0422 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 2/5
858/857 [==============================] - 75s 88ms/step - loss: 15.4719 - accuracy: 0.0401 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 3/5
858/857 [==============================] - 77s 89ms/step - loss: 15.4230 - accuracy: 0.0431 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 4/5
858/857 [==============================] - 76s 89ms/step - loss: 15.4268 - accuracy: 0.0429 - val_loss: 15.5120 - val_accuracy: 0.0371
Epoch 5/5
858/857 [==============================] - 75s 88ms/step - loss: 15.4287 - accuracy: 0.0428 - val_loss: 15.5120 - val_accuracy: 0.0371

The batch size is small because Coursera's Jupyter notebook limits it to 10.

Your code is correct. I suspect it has to do with the optimizer. Try Adam instead of RMSProp, and try setting Adam's learning rate to the default of 0.001. Other than that, your notebook extracts the labels and data correctly, sets up the data generators properly, and the network looks right.
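For reference, a minimal sketch of that suggestion, assuming the rest of the notebook stays unchanged (note that in older Keras versions the argument is named lr rather than learning_rate):

from tensorflow.keras.optimizers import Adam

# Recompile the same model with Adam at its default learning rate of 0.001,
# replacing RMSprop(lr=0.01) from the question.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=Adam(learning_rate=0.001),
              metrics=['accuracy'])

A learning rate of 0.01 is an order of magnitude above the Keras default, which can be aggressive enough for the loss to get stuck at a high plateau like the one in the log above.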
