Keras - how to pass an array of images to ImageDataGenerator.flow

I'm learning about image classification in Keras. I've downloaded a sample dataset of donuts and waffles, but the images differ in size. To standardise them, I load the images from their directories, resize them, and store them in numpy arrays:

from os import listdir

import numpy as np
from numpy import asarray
from PIL import Image

test_data_dir = 'v_data/train/donuts_and_waffles/'
validation_data_dir = 'v_data/test/donuts_and_waffles/'

loaded_test_donuts = list()
for filename in listdir(test_data_dir + 'donuts/'):
    image1 = Image.open(test_data_dir + 'donuts/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_test_donuts.append(img_data)

loaded_test_waffles = list()
for filename in listdir(test_data_dir + 'waffles/'):
    image1 = Image.open(test_data_dir + 'waffles/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_test_waffles.append(img_data)

loaded_validation_donuts = list()
for filename in listdir(validation_data_dir + 'donuts/'):
    image1 = Image.open(validation_data_dir + 'donuts/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_validation_donuts.append(img_data)

loaded_validation_waffles = list()
for filename in listdir(validation_data_dir + 'waffles/'):
    image1 = Image.open(validation_data_dir + 'waffles/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_validation_waffles.append(img_data)

test_data = list()
validation_data = list()

test_data.append(np.array(loaded_test_donuts))
test_data.append(np.array(loaded_test_waffles))
validation_data.append(np.array(loaded_validation_donuts))
validation_data.append(np.array(loaded_validation_waffles))

test_data = np.array(test_data)
validation_data = np.array(validation_data)

Then I want to create an ImageDataGenerator for my data:

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator( 
    rescale=1. / 255, 
    shear_range=0.2, 
    zoom_range=0.2, 
    horizontal_flip=True) 

test_datagen = ImageDataGenerator(rescale=1. / 255) 

train_generator = train_datagen.flow(
    # how can I pass test_data here (and with which parameters)?
)

validation_generator = test_datagen.flow(
    # how can I pass validation_data here (and with which parameters)?
)

How can I achieve that? I have tried this:

train_generator = train_datagen.flow( 
    test_data,                                  #does not work
    batch_size=batch_size) 

validation_generator = test_datagen.flow( 
    validation_data,                            #does not work
    batch_size=batch_size) 

but then I get this error:

Traceback (most recent call last):
...

ValueError: ('Input data in `NumpyArrayIterator` should have rank 4. You passed an array with shape', (2, 770, 224, 224, 3))

It's hard to say what is going wrong without the error message, but I assume the problem is that you are passing lists to your ImageDataGenerators. You can fix this easily by converting your lists to numpy arrays:

test_data = list()
validation_data = list()

test_data.append(np.array(loaded_test_donuts))
test_data.append(np.array(loaded_test_waffles))
validation_data.append(np.array(loaded_validation_donuts))
validation_data.append(np.array(loaded_validation_waffles))

test_data = np.array(test_data)
validation_data = np.array(validation_data)

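Edit: A better way is to stack the per-class arrays instead of appending them to a list and converting. Stacking along the first axis gives a rank-4 array of shape (number of images, 224, 224, 3), which is what the error message asks for: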

test_data = np.vstack((np.array(loaded_test_donuts),np.array(loaded_test_waffles)))

validation_data = np.vstack((np.array(loaded_validation_donuts),np.array(loaded_validation_waffles)))
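To make the effect of the stacking visible, here is a minimal sketch on synthetic data. The tiny 8x8 arrays, the names x and y, and the 0/1 label encoding are assumptions for illustration and do not come from the question. It also builds one label per image, which a classifier would presumably need next to the images.

```python
import numpy as np

# Synthetic stand-ins for the resized images (assumption: tiny 8x8 RGB
# arrays instead of the real 224x224 photos, to keep the example fast).
donuts = [np.zeros((8, 8, 3), dtype=np.uint8) for _ in range(5)]
waffles = [np.ones((8, 8, 3), dtype=np.uint8) for _ in range(4)]

# Stack along the first axis: (5, 8, 8, 3) + (4, 8, 8, 3) -> (9, 8, 8, 3).
x = np.vstack((np.array(donuts), np.array(waffles)))

# One integer label per image: 0 = donut, 1 = waffle (hypothetical encoding).
y = np.concatenate((np.zeros(len(donuts), dtype=np.int64),
                    np.ones(len(waffles), dtype=np.int64)))

print(x.shape, x.ndim, y.shape)
```

With arrays like these, the call from the question would presumably become train_datagen.flow(x, y, batch_size=batch_size), since flow takes the label array as its second argument. Note that the two classes do not need the same number of images for this to work.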

What I would recommend is to create a folder containing n subfolders, one per class (such as "dog" and "cat"), run the preprocessing step first, and then save the processed images like this:

import glob
import os

from PIL import Image
from keras.preprocessing import image

W = 500
H = 825

for folder in glob.glob("*"):  # goes through every class folder
    # reads image names from the folder, assuming the images are PNG
    ims = glob.glob(os.path.join(folder, "*.png"))
    for im in ims:
        img = Image.open(im)
        print(im)
        if img.size != (W, H):  # PIL reports size as (width, height)
            # resizing stands in for the "process" step; swap in your own preprocessing
            imgr = img.resize((W, H))
            imgr.save(im)

Then split your data into train and validation folders and do:

traingen = image.ImageDataGenerator(rescale=1./255)
validationgen = image.ImageDataGenerator(rescale=1./255)

# target_size is (height, width); batch_size=32 is an example value
train = traingen.flow_from_directory("train", target_size=(H, W), batch_size=32, shuffle=True)
val = validationgen.flow_from_directory("validation", target_size=(H, W), batch_size=32, shuffle=False)
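The split into train and validation folders mentioned above is left open in this answer. One possible way to do it, sketched with the standard library only, is shown below. The function name split_into_train_val, the val_fraction and seed parameters, and the 20% default are all hypothetical choices, and the sketch assumes each class folder contains only image files.

```python
import os
import random
import shutil


def split_into_train_val(src_dir, dst_dir, val_fraction=0.2, seed=0):
    """Copy src_dir/<class>/* into dst_dir/train/<class> and dst_dir/validation/<class>."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    for class_name in sorted(os.listdir(src_dir)):
        class_dir = os.path.join(src_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        files = sorted(os.listdir(class_dir))
        rng.shuffle(files)
        n_val = int(len(files) * val_fraction)  # first n_val shuffled files go to validation
        for i, name in enumerate(files):
            subset = "validation" if i < n_val else "train"
            target = os.path.join(dst_dir, subset, class_name)
            os.makedirs(target, exist_ok=True)
            shutil.copy2(os.path.join(class_dir, name), target)
```

Calling split_into_train_val("all_images", ".") would then produce the "train" and "validation" folders that the flow_from_directory calls above read from, where "all_images" is a placeholder for wherever the class folders live.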

Your test_data does not have the correct shape. You have to convert it into a rank-4 array, for example one of shape (770, 224, 224, 3), where 770 is the number of images, 224x224 is the size of each image in pixels, and 3 is the number of color channels.
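If the rank-5 array from the question already exists, one way to get to rank 4 is to merge the class axis into the image axis with a reshape. The sketch below uses small synthetic data, and the names data, images and labels are assumptions. It only applies when every class has the same number of images, as the (2, 770, 224, 224, 3) shape in the error suggests.

```python
import numpy as np

# Stand-in for the question's array: 2 classes x 6 images of 4x4 RGB
# (assumption: small sizes replace the real (2, 770, 224, 224, 3)).
data = np.zeros((2, 6, 4, 4, 3), dtype=np.uint8)
data[1] = 1  # mark the second class so the ordering can be checked

# Merge the class axis into the image axis: (2, 6, ...) -> (12, ...).
images = data.reshape((-1,) + data.shape[2:])

# Labels follow the same order: first all of class 0, then all of class 1.
labels = np.repeat(np.arange(data.shape[0]), data.shape[1])

print(images.shape, images.ndim, labels.shape)
```

The reshape keeps the images in class order, so the labels can be derived from the original first axis without touching the pixel data.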
