
Input pipeline for semantic image segmentation (3 labels) with Keras (TensorFlow backend) using flow_from_directory()

I am using Keras (TensorFlow backend) and am trying to understand how to feed in my labels/masks for image segmentation (3 labels) using flow_from_directory.

The train_images have the dimensions (144, 144, 144) - grayscale, uint8. The corresponding label_images have the same dimensions, but there the value 1 represents label 1, 2 represents label 2, 3 represents label 3, and 0 marks unlabeled pixels.

Since this is semantic segmentation, classifying each pixel in the image requires a pixel-wise cross-entropy loss function. As I have read in some posts, Keras (or TensorFlow) requires that my label_image/mask be one-hot encoded. I therefore expect each label_image to become an image with 3 channels, where each pixel holds a binary vector. Example: [0, 1, 0].

How do I deal with the unlabeled pixels that are stored as 0? Should they be encoded as [0, 0, 0]?

The question I cannot find an answer to is: how do I reshape/one-hot encode my label_images correctly? Is there a handy function in Keras that lets me convert my label_images?

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1. / 255)
label_datagen = ImageDataGenerator(rescale=1. / 255)

train_image_generator = train_datagen.flow_from_directory(
    directory='/train_images',
    target_size=(144, 144, 144),
    color_mode='grayscale',
    classes=None,
    class_mode=None,
    batch_size=4)

train_label_generator = label_datagen.flow_from_directory(
    directory='/label_images',
    target_size=(144, 144, 144),
    color_mode='grayscale',
    classes=None,
    class_mode=None,
    batch_size=4)

train_generator = zip(train_image_generator, train_label_generator)

I am currently working on something very similar, but with 10 classes. I am not entirely there yet, but as to your question about built-in functions in Keras, check out:

one_hot_array = keras.utils.to_categorical(array_of_label_data, nb_classes)

which creates a one-hot encoding of your mask/label data. So for your case, the expected output for, say, 100 masks would be (100, H, W, 3), where 3 is the number of classes you're working with. What I'm not sure about is whether you have a background class in your mask, and how you're supposed to structure the folders for your data. Hope that helps, though.
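To make the background question concrete, here is a minimal NumPy sketch of the same encoding, not the Keras call itself. It assumes the masks hold the integer values 0 to 3 described in the question, so the encoding has to cover four values; to_categorical with a class count of 3 would not have a slot for the value 3. The helper name one_hot_mask and its drop_background flag are made up for illustration. Dropping channel 0 afterwards turns unlabeled pixels into [0, 0, 0], which with a standard categorical cross-entropy should contribute zero loss for those pixels.

```python
import numpy as np

def one_hot_mask(mask, num_classes=4, drop_background=False):
    """One-hot encode an integer mask of shape (..., H, W).

    Assumes label values are integers in [0, num_classes - 1],
    with 0 meaning "unlabeled" (as described in the question).
    """
    mask = np.asarray(mask).astype(np.int64)
    # Indexing the identity matrix with the labels picks one row per pixel.
    encoded = np.eye(num_classes, dtype=np.float32)[mask]
    if drop_background:
        # Drop channel 0 so unlabeled pixels become all zeros.
        encoded = encoded[..., 1:]
    return encoded

mask = np.array([[0, 1],
                 [2, 3]], dtype=np.uint8)

with_bg = one_hot_mask(mask)
without_bg = one_hot_mask(mask, drop_background=True)
print(with_bg.shape)     # (2, 2, 4)
print(without_bg[0, 0])  # [0. 0. 0.]
print(without_bg[0, 1])  # [1. 0. 0.]
```

Whether to keep the background as its own class or to zero it out depends on whether the network should learn to predict "unlabeled" or simply ignore those pixels.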

Also, your target_size is off: it refers to the dimensions of your images (e.g. height and width). There shouldn't be a third value.
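Putting the pieces together, a rough sketch of the pairing step could look like the following. It is written against plain NumPy arrays so it runs on its own; paired_batches and the fake data are hypothetical stand-ins for the two flow_from_directory generators, which would be created with target_size=(144, 144). Two assumptions are worth calling out: the label generator should not use rescale=1. / 255, since that would turn the labels 1, 2, 3 into fractions, and both generators should get the same seed (or shuffle=False) so that images and masks stay in the same order.

```python
import numpy as np

def paired_batches(image_batches, mask_batches, num_classes=4):
    """Yield (images, one_hot_masks) from two batch iterators kept in step.

    image_batches / mask_batches stand in for the two flow_from_directory
    generators; here they are any iterables of arrays shaped (N, H, W, 1).
    """
    eye = np.eye(num_classes, dtype=np.float32)
    for images, masks in zip(image_batches, mask_batches):
        # Masks must hold the raw integer labels (no 1/255 rescaling).
        labels = np.rint(masks[..., 0]).astype(np.int64)
        # Scale image intensities to [0, 1]; one-hot encode the labels.
        yield images.astype(np.float32) / 255.0, eye[labels]

# Stand-in data: two batches of four 144x144 grayscale images and masks.
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(4, 144, 144, 1)) for _ in range(2)]
fake_masks = [rng.integers(0, 4, size=(4, 144, 144, 1)) for _ in range(2)]

x, y = next(paired_batches(fake_images, fake_masks))
print(x.shape, y.shape)  # (4, 144, 144, 1) (4, 144, 144, 4)
```

With real data, the two lists would be replaced by the image and label generators, and the image generator would then not need its own rescale argument, because the sketch already divides by 255.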


 