Fruit Image Classifier (Python)

Question

I am trying to build a fruit image classifier in Python that classifies 7 kinds of fruit. I have 15077 images in the training set and 4204 images in the validation set. I trained the model for 10 epochs and got the following results:

Train on 15077 samples, validate on 4204 samples
Epoch 1/10
15077/15077 [==============================] - 264s 17ms/step - loss: 1.0652 - accuracy: 0.5325 - val_loss: 0.3722 - val_accuracy: 0.8428
Epoch 2/10
15077/15077 [==============================] - 256s 17ms/step - loss: 0.4236 - accuracy: 0.8405 - val_loss: 0.2910 - val_accuracy: 0.9034
Epoch 3/10
15077/15077 [==============================] - 499s 33ms/step - loss: 0.2682 - accuracy: 0.9107 - val_loss: 0.3614 - val_accuracy: 0.8830
Epoch 4/10
15077/15077 [==============================] - 243s 16ms/step - loss: 0.2022 - accuracy: 0.9381 - val_loss: 0.0985 - val_accuracy: 0.9724
Epoch 5/10
15077/15077 [==============================] - 245s 16ms/step - loss: 0.1500 - accuracy: 0.9548 - val_loss: 0.1258 - val_accuracy: 0.9536
Epoch 6/10
15077/15077 [==============================] - 253s 17ms/step - loss: 0.1509 - accuracy: 0.9529 - val_loss: 0.1831 - val_accuracy: 0.9317
Epoch 7/10
15077/15077 [==============================] - 245s 16ms/step - loss: 0.1020 - accuracy: 0.9678 - val_loss: 0.2164 - val_accuracy: 0.9391
Epoch 8/10
15077/15077 [==============================] - 255s 17ms/step - loss: 0.0668 - accuracy: 0.9816 - val_loss: 0.3004 - val_accuracy: 0.9229
Epoch 9/10
15077/15077 [==============================] - 243s 16ms/step - loss: 0.1081 - accuracy: 0.9704 - val_loss: 1.4997 - val_accuracy: 0.8639
Epoch 10/10
15077/15077 [==============================] - 240s 16ms/step - loss: 0.0765 - accuracy: 0.9784 - val_loss: 0.1763 - val_accuracy: 0.9424
Test loss: 0.17632227091173225
Test accuracy: 0.9424358010292053

I wonder why the validation accuracy oscillates like a sine wave. I thought it should increase every epoch. Do you have any recommendations for modifying the code? Thanks for replying.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import cv2
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.layers import LSTM, Input, TimeDistributed
from keras.models import Model
from keras.optimizers import RMSprop, SGD

# Import the backend
from keras import backend as K

dataDirTrain = "C:/Users/TCSEAKIN/Desktop/Py/AI-hack/AI/Training"
dataDirTest = "C:/Users/TCSEAKIN/Desktop/Py/AI-hack/AI/Test"
categories = ["Armut", "Cilek", "Elma_Kirmizi", "Elma_Yesil", "Mandalina", "Muz","Portakal"]


training_data = []
test_data = []

for category in categories:
    path = os.path.join(dataDirTrain, category)
    class_num = categories.index(category)
    for img in os.listdir(path):
        try:
            imgTrainArray = cv2.imread(os.path.join(path,img))
            newTrainArray = cv2.resize(imgTrainArray, (50, 50))
            training_data.append([newTrainArray, class_num])
        except Exception as e:
            pass


for category in categories:
    path = os.path.join(dataDirTest, category)
    class_num = categories.index(category)
    for img in os.listdir(path):
        try:
            imgTestArray = cv2.imread(os.path.join(path,img))
            newTestArray = cv2.resize(imgTestArray, (50, 50))
            test_data.append([newTestArray, class_num])
        except Exception as e:
            pass

X_train = []
x_test = []
y_train = []
y_test = []

for features, label in training_data:
    X_train.append(features)
    y_train.append(label)

for features, label in test_data:
    x_test.append(features)
    y_test.append(label)

X_train = np.array(X_train).reshape(-1, 50, 50, 3)
y_train = np.array(y_train).reshape(-1, 1)

x_test = np.array(x_test).reshape(-1, 50, 50, 3)
y_test = np.array(y_test).reshape(-1, 1)


X_train = X_train/255
x_test = x_test/255

from keras.utils import to_categorical

Y_train_one_hot = to_categorical(y_train)
Y_test_one_hot = to_categorical(y_test)

model_cnn = Sequential()
# First convolutional layer, note the specification of shape
model_cnn.add(Conv2D(64, kernel_size=(3, 3),activation='relu',input_shape=(50, 50, 3)))


model_cnn.add(Conv2D(128, (3, 3), activation='relu'))
model_cnn.add(MaxPooling2D(pool_size=(2, 2)))
model_cnn.add(Conv2D(256, (3, 3), activation='relu'))
model_cnn.add(MaxPooling2D(pool_size=(2, 2)))
model_cnn.add(Dropout(0.5))

model_cnn.add(Flatten())
model_cnn.add(Dense(128, activation='relu'))
model_cnn.add(Dropout(0.5))
model_cnn.add(Dense(64, activation='relu'))
model_cnn.add(Dropout(0.5))
model_cnn.add(Dense(7, activation='softmax'))

model_cnn.compile(loss="categorical_crossentropy",
              optimizer="adam",
              metrics=['accuracy'])

model_cnn.fit(X_train, Y_train_one_hot,
          batch_size=64,
          epochs=10,
          verbose=1,
          validation_data=(x_test, Y_test_one_hot))
score = model_cnn.evaluate(x_test, Y_test_one_hot, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

model_cnn.save("C:/Users/TCSEAKIN/Desktop/Training3.py")

Answer 1

I think the training accuracy is already good. You can do a couple of things to improve validation accuracy and overall performance.

Use ImageDataGenerator to augment the images and get better model accuracy.
Use a lower learning rate or an adaptive learning rate.
Update two lines as shown below. The softmax layer at the end of the model has been found to introduce numerical instabilities.

Replace these two lines, from:

model_cnn.add(Dense(7, activation='softmax'))

model_cnn.compile(loss="categorical_crossentropy",
              optimizer="adam",
              metrics=['accuracy'])

to:

model_cnn.add(Dense(7))

model_cnn.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              optimizer="adam",
              metrics=['accuracy'])
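
To illustrate why this change can matter, here is a minimal NumPy sketch (not the Keras implementation) that compares cross-entropy computed from softmax probabilities with cross-entropy computed directly from logits. The helper names and the clipping value eps=1e-7 are assumptions chosen for illustration; Keras backends clip probabilities in a similar way, but the exact behavior may differ between versions.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum so the exponentials cannot overflow.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_from_probs(probs, one_hot, eps=1e-7):
    # Probabilities are clipped before the log (eps is an assumed value).
    clipped = np.clip(probs, eps, 1.0 - eps)
    return -np.sum(one_hot * np.log(clipped), axis=-1)

def cross_entropy_from_logits(logits, one_hot):
    # Log-softmax via the log-sum-exp trick; no clipping is needed.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -np.sum(one_hot * log_probs, axis=-1)

# Two made-up 7-class examples: a moderate one and a confidently wrong one.
logits = np.array([[2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 0.3],
                   [30.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
one_hot = np.eye(7)[[0, 1]]  # true classes are 0 and 1

loss_probs = cross_entropy_from_probs(softmax(logits), one_hot)
loss_logits = cross_entropy_from_logits(logits, one_hot)
print("from probabilities:", loss_probs)
print("from logits:       ", loss_logits)
```

For the moderate example both paths agree. For the confidently wrong example the probability path saturates near -log(eps), while the logits path keeps the full loss. Also note that once the final layer has no softmax, model_cnn.predict returns logits, so a softmax has to be applied to them if probabilities are needed.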

Try tuning some hyperparameters, such as changing batch_size from 64 to 32 (which increases the number of steps per epoch), trying other optimizers, or increasing the number of epochs.
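
As a rough illustration of the augmentation suggested earlier in this answer, the sketch below applies random horizontal flips and small shifts with plain NumPy. It is a stand-in for the kind of transformation ImageDataGenerator performs, not its API; augment_batch, batches, and max_shift are hypothetical names, and the random arrays only mimic the 50x50x3 inputs from the question.

```python
import numpy as np

def augment_batch(images, rng, max_shift=4):
    """Randomly flip and shift a batch of images shaped (N, H, W, C)."""
    out = np.empty_like(images)
    n, h, w, _ = images.shape
    pad = ((max_shift, max_shift), (max_shift, max_shift), (0, 0))
    for i in range(n):
        img = images[i]
        # Mirror left-right with probability 0.5.
        if rng.random() < 0.5:
            img = img[:, ::-1, :]
        # Pad with edge pixels, then crop a randomly offset HxW window,
        # which shifts the image by up to max_shift pixels in each direction.
        padded = np.pad(img, pad, mode="edge")
        dy, dx = rng.integers(0, 2 * max_shift + 1, size=2)
        out[i] = padded[dy:dy + h, dx:dx + w, :]
    return out

def batches(x, y, batch_size, rng):
    """Yield shuffled, augmented (x, y) batches covering one epoch."""
    order = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        yield augment_batch(x[idx], rng), y[idx]

rng = np.random.default_rng(0)
x_demo = rng.random((100, 50, 50, 3)).astype("float32")  # fake images in [0, 1]
y_demo = rng.integers(0, 7, size=100)                    # fake labels for 7 classes

demo_batches = list(batches(x_demo, y_demo, batch_size=32, rng=rng))
print(len(demo_batches), demo_batches[0][0].shape)
```

Because every epoch sees slightly different versions of each image, the network has a harder time memorizing the training set, which is the effect the augmentation advice is aiming for.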

Hope it helps.

Answer 2

In fact, this is normal behavior. You can reduce the learning rate to decrease the differences in accuracy between training epochs (you can also try other optimizers if you want).
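
The link between learning rate and epoch-to-epoch fluctuation can be seen in a toy problem. The sketch below is only an analogy, not the questioner's network: it runs gradient descent on f(w) = 0.5 * w**2 with noise added to the gradient (standing in for mini-batch noise) and compares how much the parameter keeps jittering for a high and a low learning rate. The function run_sgd and all constants are made up for illustration.

```python
import numpy as np

def run_sgd(lr, steps=5000, seed=0):
    """Noisy gradient descent on f(w) = 0.5 * w**2; returns the path of w."""
    rng = np.random.default_rng(seed)
    w = 5.0
    history = []
    for _ in range(steps):
        # True gradient is w; the added normal noise mimics mini-batch sampling.
        grad = w + rng.normal()
        w -= lr * grad
        history.append(w)
    return np.array(history)

high = run_sgd(lr=0.5)
low = run_sgd(lr=0.05)

# Spread of w after training has settled (second half of each run).
spread_high = high[2500:].std()
spread_low = low[2500:].std()
print("spread with lr=0.5: ", spread_high)
print("spread with lr=0.05:", spread_low)
```

Both runs hover around the minimum at w = 0, but the run with the smaller learning rate stays much closer to it. The trade-off is that a smaller learning rate also approaches the minimum more slowly, so it may need more epochs.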

Also, you should normalize your images to improve your network's ability to generalize. You can read more about normalization here: https://machinelearningmastery.com/how-to-normalize-center-and-standardize-images-with-the-imagedatagenerator-in-keras/
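
The question's code already rescales pixels to [0, 1] by dividing by 255. One further step in the same direction is per-channel standardization, sketched below with NumPy. The random arrays are placeholders for real images, and the variable names are illustrative. The one design choice worth noting is that the mean and standard deviation are computed on the training set only and then reused for the validation set.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder uint8 images shaped like the 50x50 RGB inputs in the question.
train_imgs = rng.integers(0, 256, size=(200, 50, 50, 3), dtype=np.uint8)
val_imgs = rng.integers(0, 256, size=(50, 50, 50, 3), dtype=np.uint8)

# Step 1: rescale to [0, 1], as the original code does.
train = train_imgs.astype("float32") / 255.0
val = val_imgs.astype("float32") / 255.0

# Step 2: per-channel statistics from the training set only.
channel_mean = train.mean(axis=(0, 1, 2))
channel_std = train.std(axis=(0, 1, 2))

# Step 3: apply the same statistics to both sets.
train_norm = (train - channel_mean) / channel_std
val_norm = (val - channel_mean) / channel_std

print("channel mean:", channel_mean)
print("channel std: ", channel_std)
```

After this step each color channel of the training data has roughly zero mean and unit variance. The same channel_mean and channel_std would also need to be applied to any image passed to the model at prediction time.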

Answer 1 (score 2, accepted): 2020-05-04 22:22:34

Answer 2 (score 0): 2020-05-04 20:23:00
