
How to view class labels after one hot encoding during training/testing and after the prediction in keras

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications.vgg16 import decode_predictions
from imutils import paths
from pathlib import Path
import numpy as np
import argparse
import cv2
import os


imagePaths = list(paths.list_images('D:/keras-cat-dog/dataset'))
data = []
labels = []


for imagePath in imagePaths:
    # extract the class label from the directory name
    label = imagePath.split(os.path.sep)[-2]

    # load the image, swap color channels, and resize it to a fixed
    # 224x224 pixels while ignoring aspect ratio
    image = cv2.imread(imagePath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))

    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)


# convert the data and labels to NumPy arrays while scaling the pixel
# intensities to the range [0, 1]
data = np.array(data) / 255.0
labels = np.array(labels)


# perform one-hot encoding on the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels = to_categorical(labels)

# partition the data into training and testing splits using 80% of
# the data for training and the remaining 20% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels,
test_size=0.20, stratify=labels, random_state=42)


# initialize the training data augmentation object
trainAug = ImageDataGenerator(
rotation_range=15,
fill_mode="nearest")


# load the VGG16 network, ensuring the head FC layer sets are left
# off
baseModel = VGG16(weights="imagenet", include_top=False,
input_tensor=Input(shape=(224, 224, 3)))


# construct the head of the model that will be placed on top of the
# the base model
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(4, 4))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(2, activation="softmax")(headModel)

# place the head FC model on top of the base model (this will become
# the actual model we will train)
model = Model(inputs=baseModel.input, outputs=headModel)

# loop over all layers in the base model and freeze them so they will
# *not* be updated during the first training process
for layer in baseModel.layers:
    layer.trainable = False

INIT_LR = 1e-3
EPOCHS = 25
BS = 8

# compile our model
print("[INFO] compiling model...")
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])


# train the head of the network
print("[INFO] training head...")
H = model.fit_generator(
    trainAug.flow(trainX, trainY, batch_size=BS),
    steps_per_epoch=len(trainX) // BS,
    validation_data=(testX, testY),
    validation_steps=len(testX) // BS,
    epochs=EPOCHS)


# make predictions on the testing set
print("[INFO] evaluating network...")
predIdxs = model.predict(testX, batch_size=BS)

# for each image in the testing set we need to find the index of the
# label with corresponding largest predicted probability
predIdxs = np.argmax(predIdxs, axis=1)



imagePath_1 = os.path.normpath('D:/Classification-master/data_two_class/test/cat/NORMAL2-IM-1396-0001.jpeg')
label_1 = imagePath_1.split(os.sep)[-2]
image_pred = cv2.imread(imagePath_1)
image_pred = cv2.cvtColor(image_pred, cv2.COLOR_BGR2RGB)
img_pred = cv2.resize(image_pred, (224, 224))
img_pred = np.array(img_pred) / 255.0



rslt = model.predict(img_pred.reshape(1,224,224,3))
#decode_predictions(rslt)

I am using the above code to classify images with Keras and TensorFlow, but I have trouble interpreting the predicted labels because I one-hot encoded them. Now, when I predict a single image, it shows an array of two probabilities. After applying the argmax function I get 0 or 1 and cannot tell what that means.

In [209]: rslt
Out[209]: array([[0.9550967 , 0.04490325]], dtype=float32)

rslt = np.argmax(rslt)
Out[219]: 0

It would help if someone could show me a way to see which class label becomes "0"/"1" during encoding in the data-processing stage, as well as the class labels the images have in the validation split (testY) and when I predict a single image.

Regards,

Subra

The Softmax function outputs numbers that represent probabilities; each value lies in the valid probability range between 0 and 1, written as [0, 1]. The values are zero or positive, and the whole output vector sums to 1.
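
For instance (a minimal NumPy sketch added here for illustration; the softmax helper below is not part of the original code), you can verify these properties numerically:

import numpy as np

def softmax(x):
    # subtract the max for numerical stability, then normalize
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([2.0, -1.0])   # arbitrary example logits
probs = softmax(scores)
print(probs)                     # every entry lies in [0, 1]
print(probs.sum())               # the vector sums to 1.0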

argmax returns the index of the maximum value along an axis.

那么打印您的標簽並了解您的第一個和第二個索引代表什么?

Since you used softmax in the last layer, it gives the probabilities of the image belonging to the different classes. In your case there are 2 classes, so it shows the probabilities of the image belonging to those two classes. If you add the probabilities 0.9550967 and 0.04490325, they sum to 1.

0.9550967 + 0.04490325 = 1

np.argmax(rslt) returns the index that holds the maximum value.

Here is example (1) - print the labels and understand what the first and second indices represent -

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from tensorflow.keras.utils import to_categorical

# define example
data = ['dog', 'dog', 'cat', 'dog', 'cat', 'cat', 'dog', 'cat', 'dog', 'dog']

values = np.array(data)

#Binary encode
lb = LabelBinarizer()

labels = lb.fit_transform(values)
labels = to_categorical(labels)
print("which position represents for cat and dog?:")
print("Data is:",data)
print(labels)

The output will be - here the first index is for cat and the second is for dog.

which position represents for cat and dog?:
Data is: ['dog', 'dog', 'cat', 'dog', 'cat', 'cat', 'dog', 'cat', 'dog', 'dog']
[[0. 1.]
 [0. 1.]
 [1. 0.]
 [0. 1.]
 [1. 0.]
 [1. 0.]
 [0. 1.]
 [1. 0.]
 [0. 1.]
 [0. 1.]]
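
Rather than eyeballing the rows, you can also read the mapping directly from the fitted binarizer (a short sketch continuing example (1); lb is the LabelBinarizer fitted above):

# classes_ is sorted alphabetically, so index 0 -> 'cat' and index 1 -> 'dog'
print(lb.classes_)                   # ['cat' 'dog']
print(dict(enumerate(lb.classes_)))  # {0: 'cat', 1: 'dog'}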

Now let's understand argmax with your softmax values, array([[0.9550967, 0.04490325]]).

Example (2): uses your softmax output as-is.

import numpy as np
rslt = np.array([[0.9550967,0.04490325]])
rslt = np.argmax(rslt)
print(rslt)

The output should be 0 because the first index has the higher value, so it is the cat according to example (1) above.

0

Example (3): swaps your softmax output.

import numpy as np
rslt = np.array([[0.04490325,0.9550967]])
rslt = np.argmax(rslt)
print(rslt)

The output should be 1 because the second index has the higher value, so it is the dog according to example (1) above.

1
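
Putting it together for your own script: if the LabelBinarizer you fitted on the training labels (lb) is still in scope, you can translate the one-hot test labels, the test-set predictions, and the single-image prediction back to class names. This is a sketch that assumes lb, testY, predIdxs and rslt from your code are available:

import numpy as np

# index -> class-name mapping learned during encoding
print(lb.classes_)                                   # e.g. ['cat' 'dog']

# class names of the one-hot encoded test labels
true_labels = lb.classes_[np.argmax(testY, axis=1)]

# class names of the test-set predictions (predIdxs already holds argmax indices)
pred_labels = lb.classes_[predIdxs]

# class name for the single-image prediction
single_label = lb.classes_[np.argmax(rslt, axis=1)[0]]
print(single_label)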
