
Keras image classification prediction error on image resize

I have a model that was trained to recognize different document types. I got the dataset from http://www.cs.cmu.edu/~aharley/rvl-cdip/.

Below is how I built my model:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
import pickle

from keras.optimizers import SGD
from keras.models import Sequential, save_model
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D

# Set image information
channels = 1
height = 1000
width = 754

model = Sequential()
# Add a Conv2D layer with 32 filters to the model
model.add(Conv2D(32, (3, 3), input_shape=(1000, 754, 3)))
# Add the reLU activation function to the model
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('relu'))

model.compile(loss='categorical_crossentropy',  # sparse_categorical_crossentropy
              # Adam(lr=.0001) SGD variation with learning rate
              optimizer='adam',
              metrics=['accuracy'])

# Image data generator to import images from data folder
datagen = ImageDataGenerator()

# Flowing images from folders sorting by labels, and generates batches of images
train_it = datagen.flow_from_directory(
    "data/train/", batch_size=16, target_size=(height, width), shuffle=True, class_mode='categorical')
test_it = datagen.flow_from_directory(
    "data/test/", batch_size=16, target_size=(height, width), shuffle=True, class_mode='categorical')
val_it = datagen.flow_from_directory(
    "data/validate/", batch_size=16, target_size=(height, width), shuffle=True, class_mode='categorical')

history = model.fit(
    train_it,
    epochs=2,
    batch_size=16,
    validation_data=val_it,
    shuffle=True,
    steps_per_epoch=2000 // 16,
    validation_steps=800 // 16)


save_model(model, "./ComplexDocumentModel")
model.save("my_model", save_format='h5')

As the last line shows, I saved my model in h5 format.

I am now trying to use that trained model to predict on a single image, to see which category it belongs to, with the script below.

from keras.models import load_model
import cv2
import numpy as np
import keras
from keras.preprocessing import image

model = load_model('my_model')

# First try
def prepare(file):
    img_array = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
    new_array = cv2.resize(img_array, (1000, 754))
    return new_array.reshape(3, 1000, 754, 1)


# Second try
img = image.load_img(
    "/home/user1/Desktop/Office/image-process/test/0000113760.tif")
img = image.img_to_array(img)
img = np.expand_dims(img, axis=-1)


prediction = model.predict(
    [prepare("/home/user1/Desktop/Office/image-process/test/0000113760.tif")])

print(prediction)

I tried predicting the image in two ways, but both give the error:

    ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape (None, 762, 3, 1)

I have also tried opening the image with PIL and converting it to a NumPy array, an approach I found on Google. Unfortunately, no other answer, blog, or video tutorial that I found helped me.

You are trying to feed a grayscale image to a network that expects an image with 3 channels. You can repeat the single channel 3 times to get a compatible shape, but it is possible that the prediction will be poor:

import cv2
import numpy as np

def prepare(file):
    img_array = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
    # cv2.resize takes (width, height), so this yields shape (1000, 754)
    new_array = cv2.resize(img_array, (754, 1000))
    # converting to RGB
    array_color = cv2.cvtColor(new_array, cv2.COLOR_GRAY2RGB) # shape is (1000,754,3)
    array_with_batch_dim = np.expand_dims(array_color, axis=0) # shape is (1,1000,754,3)
    return array_with_batch_dim
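
The shape bookkeeping behind that fix can be checked without OpenCV or a trained model. The following is only a minimal sketch in plain NumPy: `gray_to_batch`, `EXPECTED_SHAPE` and the all-zeros `fake_page` are hypothetical names and data introduced for illustration, and the expected shape is taken from the `input_shape=(1000, 754, 3)` in the question.

```python
import numpy as np

# (height, width, channels), taken from input_shape in the question's model
EXPECTED_SHAPE = (1000, 754, 3)

def gray_to_batch(gray):
    """Turn a 2-D grayscale array into a (1, height, width, 3) batch."""
    if gray.ndim != 2:
        raise ValueError(f"expected a 2-D grayscale array, got shape {gray.shape}")
    # Copy the single channel three times along a new last axis.
    stacked = np.stack([gray, gray, gray], axis=-1)
    if stacked.shape != EXPECTED_SHAPE:
        raise ValueError(f"expected {EXPECTED_SHAPE}, got {stacked.shape}")
    # Add the leading batch axis that model.predict expects.
    return np.expand_dims(stacked, axis=0)

# Stand-in for a resized grayscale page (hypothetical data, all zeros).
fake_page = np.zeros((1000, 754), dtype=np.uint8)
batch = gray_to_batch(fake_page)
print(batch.shape)  # (1, 1000, 754, 3)
```

Raising early on a shape mismatch gives a clearer message than the layer-compatibility error Keras reports at predict time.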

Another solution is to not convert your image to grayscale when you read it, by omitting the flag cv2.IMREAD_GRAYSCALE. The default behaviour of OpenCV is to load an image with 3 channels (in BGR order).

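def prepare(file):
    img_array = cv2.imread(file) # 3 channels, in BGR order
    # cv2.resize takes (width, height), so this yields shape (1000, 754, 3)
    new_array = cv2.resize(img_array, (754, 1000))
    # converting to RGB
    array_rgb = cv2.cvtColor(new_array, cv2.COLOR_BGR2RGB)
    array_with_batch_dim = np.expand_dims(array_rgb, axis=0) # shape is (1,1000,754,3)
    return array_with_batch_dim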

Note: Depending on your preprocessing, you might need to normalize your image to the range 0 to 1 by dividing it by 255 before feeding it to the network.
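
As a rough sketch of that normalization step (the helper name `normalize_batch` and the dummy array are assumptions made for illustration): scaling is only appropriate if training applied the same scaling, for example through `ImageDataGenerator(rescale=1./255)`. The training script in the question uses a plain `ImageDataGenerator()`, so that particular model appears to have been trained on raw 0 to 255 values.

```python
import numpy as np

def normalize_batch(batch):
    """Scale uint8 pixel values from [0, 255] to float32 values in [0.0, 1.0]."""
    return batch.astype("float32") / 255.0

# Dummy all-white batch with the shape the model expects (hypothetical data).
dummy = np.full((1, 1000, 754, 3), 255, dtype=np.uint8)
scaled = normalize_batch(dummy)
print(scaled.shape, scaled.dtype)
```

Whatever preprocessing was used at training time should be reproduced exactly at prediction time; a mismatch does not raise an error, it only degrades the predictions.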
