Keras Multi Input Network, using Images and structured data : How do I build the correct input data?

Question

I am building a multi-input network using the Keras functional API, but I am struggling to find and understand the right format for feeding my input data through the network.

I have two main inputs:

One is an image, which goes through a fine-tuned ResNet50 CNN.
The second is a simple NumPy array (X_train) containing metadata about the image (position and size of the image). This one goes through a simple dense network.

I load the images from a dataframe that contains the metadata and the file path to the corresponding image. I use ImageDataGenerator and its flow_from_dataframe method to load my images:

datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_flow = datagen.flow_from_dataframe(
                                        dataframe=df_train,
                                        x_col="cropped_img_filepath",
                                        y_col="category",
                                        batch_size=batch_size,
                                        shuffle=False,
                                        class_mode="categorical",
                                        target_size=(224,224)
                                        )

I can train the two networks separately on their own data without any problem.
The outputs of the two networks are then concatenated and fed to a dense network that outputs a 10-element probability vector:

# Create the input for the final dense network using the output of both the dense MLP and CNN
combinedInput = concatenate([cnn.output, mlp.output])

x = Dense(512, activation="relu")(combinedInput)
x = Dense(256, activation="relu")(x)
x = Dense(128, activation="relu")(x)
x = Dense(32, activation="relu")(x)
x = Dense(10, activation="softmax")(x)



model = Model(inputs=[cnn.input, mlp.input], outputs=x)

# Compile the model 
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="categorical_crossentropy",
              metrics=['accuracy'],
              optimizer=opt)

# Train the model
model_history = model.fit(x=(train_flow, X_train), 
                          y=y_train, 
                          epochs=1, 
                          batch_size=batch_size)

However, I cannot train the overall network. I get the following error:

ValueError: Failed to find data adapter that can handle input: (<class 'tuple'> containing values of types {"<class 'keras_preprocessing.image.dataframe_iterator.DataFrameIterator'>", "<class 'numpy.ndarray'>"}), <class 'pandas.core.series.Series'>

I understand I am not using the correct input format for my input data.
I can train my CNN with the train_flow, and my dense network with X_train, so I was hoping this would work.

Do you have any idea how to combine image data and a NumPy array into a multi-input array?

Thank you for all the information you can give me!

Answer 1

I finally found out how to do it, taking inspiration from the post that @Nima Aghli suggested.
Here is how I did it:

First, define the preprocessing function (for me, the one used for ResNet50):

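import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def preprocess_function(x):
    # Add a batch axis when given a single (H, W, C) image
    if x.ndim == 3:
        x = x[np.newaxis, :, :, :]
    return preprocess_input(x)

# Initialize the datagen (note: preprocess_input is passed here directly)
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)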

Then define the custom data generator that yields randomly sampled batches pairing images with their metadata, while making sure it never runs out of data (so that you can train for any number of epochs):

def createGenerator(dff, verif=False, batch_size=BATCH_SIZE):

    # Shuffles the dataframe, and so the batches as well
    dff = dff.sample(frac=1)
    
    # Shuffle=False is EXTREMELY important to keep order of image and coord
    flow = datagen.flow_from_dataframe(
                                        dataframe=dff,
                                        directory=None,
                                        x_col="cropped_img_filepath",
                                        y_col="category",
                                        batch_size=batch_size,
                                        shuffle=False,
                                        class_mode="categorical",
                                        target_size=(224,224),
                                        seed=42
                                      )
    idx = 0
    while True:
        # Get the next batch: X1[0] holds the images, X1[1] the one-hot labels
        X1 = next(flow)
        # Index to reach (the last batch of a pass may be smaller)
        end = idx + X1[0].shape[0]
        # Get the matching rows of metadata from the dataframe
        X2 = dff[["x", "y", "w", "h"]].iloc[idx:end].to_numpy()
        dff_verif = dff.iloc[idx:end]
        # Update the index for the next batch
        idx = end
        # Wrap around once the end of the dataframe is reached
        if idx == len(dff):
            idx = 0

        # Yield the image, metadata and target batches
        if verif:
            yield [X1[0], X2], X1[1], dff_verif
        else:
            yield [X1[0], X2], X1[1]  # images and metadata as inputs, shared labels as target

I deliberately kept the comments, as they help to follow each operation.
The main difficulty is to draw images from the whole dataframe without ever running short of data. Every batch has the same size, except possibly the last one of each pass.
We also have to be careful about the order of the images and metadata, so that the right information is attached to the right image in the returned arrays.
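
To make the index bookkeeping easier to check, here is a minimal sketch that isolates it from Keras. The helper names batch_slices and steps_per_epoch and the sizes used are hypothetical, introduced only for illustration. It assumes, as in createGenerator, that the image iterator serves rows in order, ends each pass with a possibly shorter batch, and then starts again from the first row.

```python
import math
from itertools import islice


def batch_slices(n_rows, batch_size):
    """Yield (start, end) row ranges forever, wrapping around after the last row."""
    idx = 0
    while True:
        # The final batch of a pass is shorter when batch_size does not divide n_rows
        end = min(idx + batch_size, n_rows)
        yield idx, end
        # Start again from the first row once every row has been served
        idx = 0 if end == n_rows else end


def steps_per_epoch(n_rows, batch_size):
    """Number of batches needed to see every row exactly once."""
    return math.ceil(n_rows / batch_size)


# Hypothetical sizes, for illustration only
n_rows, batch_size = 5, 2
steps = steps_per_epoch(n_rows, batch_size)
# Take one full pass plus the first batch of the next pass
slices = list(islice(batch_slices(n_rows, batch_size), steps + 1))
print(steps)
print(slices)
```

Because the generator never stops on its own, Keras has to be told how many batches make up one epoch. Passing a value computed like steps_per_epoch(len(df_train), BATCH_SIZE) as the steps_per_epoch argument of model.fit, together with createGenerator(df_train), is one way to do that. This call is an assumption on my side, since the exact fit call is not shown above.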

solution1
0 ACCEPTED 2020-08-25 14:20:00
