How do I convert a folder of images into X and Y batches with Keras?

Question

Say I have a folder of images such as:

PetData
|
Dog - images
|
Cat - images

How would I transform it into (x_train, y_train), (x_test, y_test) format? I see this format used extensively with the MNIST dataset, which is loaded like this:

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()

However, I'd like to do this with my own folder of images.

Answer 1

mnist.load_data() returns two tuples containing the images and their labels as uint8 arrays. You can build the same arrays by loading the images from your folders: use a module such as PIL.Image to load X, and take y from the labels given by the folder names.

Example of PIL.Image usage:

from PIL import Image
import glob

for infile in glob.glob("*.jpg"):
    im = Image.open(infile)
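
To make the "y comes from the folder name" step concrete, here is a minimal sketch that only collects file paths and integer labels, assuming one subfolder per class directly under the root (for example PetData/Dog and PetData/Cat). The function name list_images_with_labels and the extension list are hypothetical, chosen for illustration.

```python
from pathlib import Path


def list_images_with_labels(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return (paths, labels, class_names) for a folder with one subfolder per class."""
    root = Path(root)
    # Sort so the class-to-index mapping is stable across runs
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    paths, labels = [], []
    for index, name in enumerate(class_names):
        for file in sorted((root / name).iterdir()):
            # Keep only regular files with a known image extension
            if file.is_file() and file.suffix.lower() in extensions:
                paths.append(str(file))
                labels.append(index)
    return paths, labels, class_names
```

Each returned path can then be opened with Image.open and converted to an array to build X, while labels becomes y.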

To split the data into training and test sets, you can use sklearn.model_selection.train_test_split:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
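
If you want the result in the same nested-tuple shape that mnist.load_data() returns, the following NumPy-only sketch shows the idea. It is not scikit-learn's implementation, only an illustration of a shuffled split; the name split_like_mnist and the seed parameter are assumptions made for this example.

```python
import numpy as np


def split_like_mnist(X, y, test_size=0.33, seed=0):
    """Shuffle X and y together and return ((x_train, y_train), (x_test, y_test))."""
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))            # one shuffled index shared by X and y
    n_test = int(np.ceil(len(X) * test_size))  # number of held-out samples
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```

Because the same index array is applied to both X and y, every image stays paired with its label after shuffling.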

Answer 2

Suppose your training or test images are in the folder PetData, with each class in its own subfolder, Dog and Cat. You can use ImageDataGenerator to prepare the train/test data as follows:

from keras import layers
from keras import models
from keras.preprocessing.image import ImageDataGenerator

model = models.Sequential()
# Define and compile your model here
# ...

# Use ImageDataGenerator to read images from directories
train_dir = "PetData/"
# PetData/Dog/ : dog images
# PetData/Cat/ : cat images
train_datagen = ImageDataGenerator(rescale=1./255)  # scale pixel values to [0, 1]
# Class labels are inferred from the subfolder names;
# each item the generator yields is an (x_batch, y_batch) tuple
train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(150, 150), batch_size=20)

# Fit the model using train_generator
# (newer TensorFlow/Keras releases deprecate fit_generator; model.fit accepts the generator directly)
history = model.fit_generator(train_generator, steps_per_epoch=100, epochs=30)
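
Note that steps_per_epoch=100 with batch_size=20 covers 2000 images per epoch, so the hard-coded value only fits a dataset of about that size. A small sketch of the arithmetic, using a hypothetical helper named steps_per_epoch:

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    return math.ceil(num_samples / batch_size)


print(steps_per_epoch(2000, 20))  # 100, the value hard-coded above
print(steps_per_epoch(2010, 20))  # 101, the final batch holds the 10 leftover images
```

As far as I know, the iterator returned by flow_from_directory exposes the number of images it found as train_generator.samples, which could be passed in as num_samples.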

Hope this helps!

Answer 3

If you want to import images from a folder on your computer, you can read them one by one from the folder and append them to a list.

Your folder layout is as you have shown:

PetData
|
Dog - images
|
Cat - images

Assume path is a variable holding the location of the PetData folder. We will use OpenCV to read the images, but you can use other libraries as well.

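import os
import cv2
import numpy as np

data = []
label = []
classes = ['Dog', 'Cat']  # label 0 = Dog, label 1 = Cat

for label_val, class_name in enumerate(classes):
    # Assumes each class folder keeps its files in an 'images' subfolder
    cpath = os.path.join(path, class_name, 'images')
    for img in os.listdir(cpath):
        image_array = cv2.imread(os.path.join(cpath, img), cv2.IMREAD_COLOR)
        if image_array is None:  # cv2.imread returns None for unreadable files
            continue
        # Resize so all images share one shape and stack into a single array;
        # (150, 150) is an arbitrary choice, pick what your model expects
        image_array = cv2.resize(image_array, (150, 150))
        data.append(image_array)
        label.append(label_val)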

Convert the lists to NumPy arrays.

data = np.asarray(data)
label = np.asarray(label)
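
As a quick sanity check on what the conversion produces, here is a small sketch that uses synthetic arrays in place of real images (the fill values and the 150x150 size are made up for illustration). It also scales the pixels to [0, 1], the same effect as rescale=1./255 in Answer 2.

```python
import numpy as np

# Synthetic stand-ins for loaded images: three 150x150 3-channel uint8 arrays
images = [np.full((150, 150, 3), fill, dtype=np.uint8) for fill in (0, 128, 255)]
labels = [0, 1, 1]

data = np.asarray(images)    # one array of shape (3, 150, 150, 3), dtype uint8
label = np.asarray(labels)   # shape (3,)

# Scale pixel values from [0, 255] to [0, 1] as float32
data_scaled = data.astype("float32") / 255.0

print(data.shape, data.dtype, label.shape)
print(data_scaled.min(), data_scaled.max())
```

The stacking into a single 4-D array only works when every image has the same height, width, and channel count.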

After importing the images, you can use train_test_split to split the data for training and testing.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data, label, test_size=0.33, random_state=42)

3 answers

Answer 1: 2, accepted, 2020-07-09 00:49:21
Answer 2: 1, 2020-07-09 00:47:59
Answer 3: 1, 2020-07-09 05:29:30
