
How to use tensorflow dataset correctly for multiple input layers with keras

I have a model with multiple input layers (20 of them) and I want to feed it from a tf.data.Dataset. The batch size is 16. Unfortunately, model.fit(train_dataset, epochs=5) throws the following error:

ValueError: Error when checking model input: the list of numpy arrays that you are passing to your model is not the size the model expected. Expected to see 20 array(s), for inputs ['input_2', ... , 'input_21'] but instead got the following list of 1 arrays: [<tf.Tensor 'args_0:0' shape=(None, 20, 512, 512, 3) dtype=int32>]...

I assume that Keras wants a shape like (20, None, 512, 512, 3). Does anyone have an idea how to solve this problem, or how to use tf.data correctly for a model with multiple input layers?

def read_tfrecord(bin_data):
    # Context (per-object) features, e.g. the label
    label_seq = {}
    for i in feature_map_dict:
        label_seq[i] = tf_input_feature_selector(feature_map_dict[i])
    # Sequence features: the 20 encoded images of one object
    img_seq = {'images': tf.io.FixedLenSequenceFeature([], dtype=tf.string)}
    cont, seq = tf.io.parse_single_sequence_example(serialized=bin_data, context_features=label_seq, sequence_features=img_seq)
    image_raw = seq['images']
    images = decode_image_raw(image_raw)
    images = tf.reshape(images, [20,512,512,3])
    images = preprocess_input(images)
    label = cont["label"]
    return images, label

def get_dataset(tfrecord_path):
    dataset = tf.data.TFRecordDataset(filenames=tfrecord_path)
    dataset = dataset.map(read_tfrecord)
    # Batch first, then prefetch, so that whole batches are prepared ahead of time
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(buffer_size=AUTOTUNE)
    return dataset

def create_model():
    nets = []
    inputs = []
    # Set up base model
    base_ResNet50 = ResNet50(weights='imagenet', include_top=False, input_shape=(512, 512, 3))
    for images_idx in range(20):
        x = Input(shape=(512,512,3))
        inputs.append(x)
        x = base_ResNet50(x)
        nets.append(x)
    maxpooling = tf.reduce_max(nets, [0])
    flatten = Flatten()(maxpooling)
    dense_1 = Dense(10,activation='sigmoid')(flatten)
    predictions = Dense(1,activation='sigmoid')(dense_1)
    model = Model(inputs=inputs, outputs=predictions)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Thanks in advance.

With a small modification of Niteya's idea, the toy test model now trains. Great!

But I am still not happy with this solution, because all 20 images belong to one object and, as far as I understand, this solution requires me to create 21 TFRecords. The information about one object would then be spread across all of these files. I would prefer a simpler solution where all the information about an object is in a single TFRecord.

This toy test model works:

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

x_1 = Input(shape=(100,100,3))
x_2 = Input(shape=(100,100,3))
inputs = [x_1,x_2]
flatten_1 = Flatten()(x_1)
flatten_2 = Flatten()(x_2)
dense_1 = Dense(50,activation='sigmoid')
d1_1 = dense_1(flatten_1)
d1_2 = dense_1(flatten_2)
nets =[d1_1,d1_2]
maxpooling = tf.reduce_max(nets, [0])
d2 = Dense(10,activation='sigmoid')(maxpooling)
predictions = Dense(1,activation='sigmoid')(d2)
model = Model(inputs=inputs, outputs=predictions)

model.compile(loss='binary_crossentropy', optimizer='adam',
        metrics=['accuracy'])

for layer in model.layers:
    print(layer.name)

input_d = tf.data.Dataset.zip(tuple(tf.data.Dataset.from_tensors(tf.random.normal([16,100,100,3])) for i in range(2))) 
output = tf.data.Dataset.from_tensors(tf.ones(16))
dataset = tf.data.Dataset.zip((input_d, output))

model.fit(dataset,epochs=5)

Using Niteya's second idea with the function tf.split is a good solution. Niteya, thank you very much.

inputs = Input(shape=(20,512,512,3))
for x in tf.split(inputs, num_or_size_splits=20, axis=1):
    x = tf.reshape(x, [-1,512,512,3])
    x = base_ResNet50(x)
    nets.append(x)

and

BATCH_SIZE = 1
model.fit(train_dataset, steps_per_epoch=10, epochs=5)

Have you considered using tf.data.Dataset.zip? Your model needs to be fed 20 separate inputs, so zip them together into one dataset, then zip that dataset with the output dataset.

I am using random inputs here, but you should get the idea from it.

    input_d = tf.data.Dataset.zip(tuple(tf.data.Dataset.from_tensors(tf.random.normal([16, 512,512,3])) for i in range(20))) 
    output = tf.data.Dataset.from_tensors(tf.ones(16))
    dataset = tf.data.Dataset.zip((input_d, output))

https://www.tensorflow.org/api_docs/python/tf/data/Dataset#zip
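If all 20 images already sit in one tensor per record, as in the question's read_tfrecord, the same tuple-of-inputs structure can probably be produced without separate files by unstacking that tensor inside a map step. The following is only a minimal sketch under assumptions: the image size is shrunk to 8x8 so it runs quickly, the records dataset is a random stand-in for the parsed TFRecord dataset, and the names NUM_VIEWS and to_multi_input are made up for illustration.

```python
import tensorflow as tf

NUM_VIEWS = 20  # number of model inputs, taken from the question
H = W = 8       # hypothetical small image size so the sketch runs quickly

# Stand-in for the parsed TFRecord dataset: each element is
# (images of shape [20, H, W, 3], scalar label).
records = tf.data.Dataset.from_tensor_slices((
    tf.random.normal([4, NUM_VIEWS, H, W, 3]),
    tf.ones([4]),
))

def to_multi_input(images, label):
    # Turn the leading axis into a tuple of 20 tensors, one per Input layer.
    return tuple(tf.unstack(images, num=NUM_VIEWS, axis=0)), label

# Unstack per record first, then batch, so every input gets its own batch axis.
dataset = records.map(to_multi_input).batch(2)

inputs_spec, label_spec = dataset.element_spec
print(len(inputs_spec), inputs_spec[0].shape, label_spec.shape)
```

In the question's pipeline, the map call would go between dataset.map(read_tfrecord) and dataset.batch(BATCH_SIZE), so each object stays in a single record on disk.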

Edit: Using split, something like this could be done. Pass in the entire dataset, and then split it (you may need to set the axis argument).

    for i in tf.split(tf_record_Input, 20):
        x = base_ResNet50(i)
        nets.append(x)
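To illustrate why the axis argument matters: tf.split defaults to axis=0, which on a batched tensor of shape (batch, 20, H, W, 3) would cut along the batch axis rather than the image axis. A small sketch on a plain eager tensor, assuming hypothetical 8x8 images and a batch of 2, shows the shapes involved:

```python
import tensorflow as tf

# Hypothetical small batch: 2 samples, 20 views each, 8x8 RGB images.
batch = tf.random.normal([2, 20, 8, 8, 3])

# Splitting along axis=1 yields 20 tensors of shape [2, 1, 8, 8, 3] ...
parts = tf.split(batch, num_or_size_splits=20, axis=1)
# ... so the leftover size-1 axis is removed before each piece is used.
views = [tf.squeeze(p, axis=1) for p in parts]

print(len(views), views[0].shape)
```

Each entry of views then has the (batch, H, W, 3) layout that a per-image base model expects; tf.reshape to [-1, H, W, 3] would remove the extra axis just as well.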
