
How to input user images to predict with Tensorflow?

For my project, I am using TensorFlow to predict handwritten user input.

Basically I used this dataset: https://www.kaggle.com/rishianand/devanagari-character-set , and created a model. I used matplotlib to view the images generated from the pixel data.

My code basically works for the training data, but I want to take it a bit further. With CV2, I created a GUI that lets the user draw a Nepali letter. After that, I have a branch that tells the program to save the image to the computer.

Here is a snippet of my code:

import cv2
import numpy as np
from PIL import Image

win = np.zeros((500, 500, 3), dtype='float64')  # blank canvas the user draws on

# event loop: show the canvas and react to key presses
while True:
    cv2.imshow('window', win)  # showing the window
    k = cv2.waitKey(1)
    if k == ord('c'):
        win = np.zeros((500, 500, 3), dtype='float64')  # clear the canvas
    # saving the image as a file so it can then be resized
    if k == ord('s'):
        cv2.imwrite("nepali_character.jpg", win)
        img = cv2.imread("nepali_character.jpg")
        cv2.imshow('char', img)
        # resizing the image to 32x32 using Pillow
        size = (32, 32)
        im = Image.open("nepali_character.jpg")
        out = im.resize(size)
        out.save('resized.jpg')
        imgout = cv2.imread('resized.jpg')
        cv2.imshow("out", imgout)
        # reading the resized image back; its pixels print as a matrix
        pix = cv2.imread('resized.jpg', 1)
        print(pix)
    if k == ord('q'):  # 'q' closes the windows and stops the loop
        cv2.destroyAllWindows()
        break
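
The snippet above only shows the display loop; the actual drawing on win happens elsewhere. A minimal sketch of how that could be wired up with an OpenCV mouse callback, assuming white strokes on the black float canvas (the callback name and stroke thickness are illustrative, not from the original code):

drawing = False  # True while the left mouse button is held down

def draw(event, x, y, flags, param):
    # paint white circles onto the shared canvas while the button is held
    global drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
    elif event == cv2.EVENT_MOUSEMOVE and drawing:
        cv2.circle(win, (x, y), 5, (1.0, 1.0, 1.0), -1)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

cv2.namedWindow('window')
cv2.setMouseCallback('window', draw)  # register before entering the while loop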

I resized the image because those are the dimensions of the data in the dataset.

Now my question is how to predict what that letter is with TensorFlow.

I asked my teacher, and he said to put it into my data file, treat it as training data, then look at the weights and pick the greatest one?

But I'm confused about how to go about that. Can I just put this image into that data file?

If anyone has any suggestions on how to take user input and then make a prediction from it, it would be greatly appreciated.

Understanding the dataset:

  1. The images are 32 x 32 pixels
  2. There are 46 different characters/letters:
['character_10_yna', 'character_11_taamatar', 'character_12_thaa', 'character_13_daa', 'character_14_dhaa', 'character_15_adna', 'character_16_tabala', 'character_17_tha', 'character_18_da', 'character_19_dha', 'character_1_ka', 'character_20_na', 'character_21_pa', 
'character_22_pha', 'character_23_ba', 'character_24_bha', 'character_25_ma',
 'character_26_yaw', 'character_27_ra', 'character_28_la', 'character_29_waw', 'character_2_kha', 'character_30_motosaw', 'character_31_petchiryakha', 'character_32_patalosaw', 'character_33_ha', 'character_34_chhya', 
'character_35_tra', 'character_36_gya', 'character_3_ga', 'character_4_gha', 'character_5_kna', 'character_6_cha', 'character_7_chha', 'character_8_ja', 
'character_9_jha', 'digit_0', 'digit_1', 'digit_2', 'digit_3', 'digit_4', 'digit_5', 'digit_6', 'digit_7', 'digit_8', 'digit_9']

Since your images are organized into one folder per class inside a training folder, the Keras implementation would be:

import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
import pathlib
dataDir = "/xx/xx/xx/xx/datasets/Devanagari/drive-download-20210601T224146Z-001/Train"
data_dir = keras.utils.get_file(dataDir, 'file://'+dataDir)
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir.glob('*/*.png')))
print(image_count)
batch_size = 32
img_height = 180 # scale it up for better performance
img_width = 180 # scale it up for better performance

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
class_names = train_ds.class_names
print(class_names) # 46 classes

對於緩存和規范化,請參閱tensorflow 教程

AUTOTUNE = tf.data.experimental.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
print(np.min(first_image), np.max(first_image))

Model setup, compile, and train:

num_classes = 46

model = Sequential([
  layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

epochs=10
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

This results in something like the following (very promising!):

Epoch 10/10
1955/1955 [==============================] - 924s 472ms/step - loss: 0.0201 - accuracy: 0.9932 - val_loss: 0.2267 - val_accuracy: 0.9504

Save the model (it takes a while to train, so it is best to save it):

!mkdir -p saved_model
model.save('saved_model/my_model')

Load the model:

loaded_model = tf.keras.models.load_model('saved_model/my_model')
# Check its architecture
loaded_model.summary()

Now for the final task: getting predictions. One way is as follows:

import cv2
im2=cv2.imread('datasets/Devanagari/drive-download-20210601T224146Z-001/Test/character_3_ga/3711.png')
im2=cv2.resize(im2, (180,180)) # resize to 180x180, since that is what the model was trained on
print(im2.shape)
img2 = tf.expand_dims(im2, 0) # expand the dims means change shape from (180, 180, 3) to (1, 180, 180, 3)
print(img2.shape)

predictions = loaded_model.predict(img2)
score = tf.nn.softmax(predictions[0]) # softmax over the model outputs

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
) # np.argmax gives the index with the highest probability (29 here, i.e. character_3_ga).
# That is the "greatest weight" your instructor was talking about.
(180, 180, 3)
(1, 180, 180, 3)
This image most likely belongs to character_3_ga with a 100.00 percent confidence.

The other way is to do it online, which is what you want to achieve: feed the user's image directly instead of going through the dataset. For this example the input shape needs to be (1, 180, 180, 3); it would be (1, 32, 32, 3) if the model had been trained without resizing. Then feed it to predict, something like below:

out = im.resize((180, 180))             # im is the PIL image from your snippet; match the training size
out = tf.expand_dims(np.array(out), 0)  # convert to an array batch of shape (1, 180, 180, 3)
predictions = loaded_model.predict(out)
score = tf.nn.softmax(predictions[0])   # softmax over the model outputs

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
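
Putting the two pieces together, a minimal sketch of how a branch of your drawing loop could predict straight from the canvas, skipping the JPEG round trip. It assumes the model above is loaded as loaded_model, that the canvas win holds float values in 0-1, and uses a hypothetical 'p' key:

if k == ord('p'):  # hypothetical "predict" key
    # scale the 0-1 float canvas to 0-255 and resize to the model's input size
    canvas = cv2.resize((win * 255).astype('uint8'), (180, 180))
    batch = tf.expand_dims(canvas, 0)  # shape (1, 180, 180, 3); the model's Rescaling layer handles /255
    score = tf.nn.softmax(loaded_model.predict(batch)[0])
    print(class_names[np.argmax(score)], float(np.max(score)))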
