
How to load data during training of a multi-output model without iteration in Keras?

I have a Keras model with 1 input and 2 outputs in TensorFlow 2. When calling model.fit I want to pass the dataset as x=train_dataset and call model.fit once. The train_dataset is built with tf.data.Dataset.from_generator, which yields: x1, y1, y2.

The only way I can run training is the following:

for x1, y1,y2 in train_dataset:
    model.fit(x=x1, y=[y1,y2],...)

How do I tell TensorFlow to unpack the variables and train without the explicit for loop? Using the for loop makes many things less practical, as does using train_on_batch.

If I run model.fit(train_dataset, ...), the function doesn't understand what x and y are, even though the model is defined as:

model = Model(name='Joined_Model', inputs=self.x, outputs=[self.network.y1, self.network.y2])

It throws an error saying it expects 2 targets while getting 1, even though the dataset has 3 variables, which can be iterated through in the loop.

The dataset and mini-batches are generated as:

def dataset_joined(self, n_epochs, buffer_size=32):
    dataset = tf.data.Dataset.from_generator(
        self.mbatch_gen_joined,
        (tf.float32, tf.float32,tf.int32),
        (tf.TensorShape([None, None, self.n_feat]),
            tf.TensorShape([None, None, self.n_feat]),
            tf.TensorShape([None, None])),
        [tf.constant(n_epochs)]
        )
    dataset = dataset.prefetch(buffer_size)
    return dataset

def mbatch_gen_joined(self, n_epochs):
    for _ in range(n_epochs):
        random.shuffle(self.train_s_list)
        start_idx, end_idx = 0, self.mbatch_size
        for _ in range(self.n_iter):
            s_mbatch_list = self.train_s_list[start_idx:end_idx]
            d_mbatch_list = random.sample(self.train_d_list, end_idx-start_idx)
            s_mbatch, d_mbatch, s_mbatch_len, d_mbatch_len, snr_mbatch, label_mbatch, _ = \
                self.wav_batch(s_mbatch_list, d_mbatch_list)
            x_STMS_mbatch, xi_bar_mbatch, _ = \
                self.training_example(s_mbatch, d_mbatch, s_mbatch_len,
                d_mbatch_len, snr_mbatch)
            #seq_mask_mbatch = tf.cast(tf.sequence_mask(n_frames_mbatch), tf.float32)
            start_idx += self.mbatch_size; end_idx += self.mbatch_size
            if end_idx > self.n_examples: end_idx = self.n_examples

            yield x_STMS_mbatch, xi_bar_mbatch, label_mbatch

Keras models expect Python generators or tf.data.Dataset objects to provide the input data as a tuple in the format (input_data, target_data) (or (input_data, target_data, sample_weights)). Each of input_data or target_data could, and should, be a list/tuple itself if the model has multiple input/output layers. Therefore, in your code, the generated data should also be made compatible with this expected format:

yield x_STMS_mbatch, (xi_bar_mbatch, label_mbatch)  # <- the second element is a tuple itself

Also, this should be reflected in the arguments passed to the from_generator method:

dataset = tf.data.Dataset.from_generator(
    self.mbatch_gen_joined,
    output_types=(
        tf.float32,
        (tf.float32, tf.int32)
    ),
    output_shapes=(
        tf.TensorShape([None, None, self.n_feat]),
        (
            tf.TensorShape([None, None, self.n_feat]),
            tf.TensorShape([None, None])
        )
    ),
    args=(tf.constant(n_epochs),)
)
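To make the fix concrete, here is a minimal self-contained sketch of the same pattern with toy data. The generator, layer sizes, and losses are illustrative stand-ins (not the question's actual `mbatch_gen_joined` or network); each yield is an already-batched (input, (target_1, target_2)) tuple, so no extra .batch() call is needed, and a single model.fit call consumes the whole dataset:

```python
import numpy as np
import tensorflow as tf

N_FEAT = 4  # illustrative feature size; the real code uses self.n_feat

def toy_generator():
    # Each yield is one pre-batched step: (input, (target_1, target_2)).
    for _ in range(5):
        x = np.random.rand(8, N_FEAT).astype("float32")
        y1 = np.random.rand(8, N_FEAT).astype("float32")
        y2 = np.random.randint(0, 2, size=(8, 1)).astype("float32")
        yield x, (y1, y2)

ds = tf.data.Dataset.from_generator(
    toy_generator,
    output_signature=(
        tf.TensorSpec(shape=(None, N_FEAT), dtype=tf.float32),
        (
            tf.TensorSpec(shape=(None, N_FEAT), dtype=tf.float32),
            tf.TensorSpec(shape=(None, 1), dtype=tf.float32),
        ),
    ),
)

# A toy 1-input / 2-output model standing in for the question's Joined_Model.
inputs = tf.keras.Input(shape=(N_FEAT,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
out1 = tf.keras.layers.Dense(N_FEAT, name="y1")(hidden)
out2 = tf.keras.layers.Dense(1, activation="sigmoid", name="y2")(hidden)
model = tf.keras.Model(inputs=inputs, outputs=[out1, out2])
model.compile(optimizer="adam", loss=["mse", "binary_crossentropy"])

# One fit call, no explicit for loop over the dataset.
history = model.fit(ds, epochs=1, verbose=0)
```

Note this sketch uses output_signature, which newer TF 2.x versions accept in place of the output_types/output_shapes pair shown above; both describe the same nested structure.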

Use yield(x1, [y1, y2]) so that model.fit will understand your generator output.
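Whichever nesting you use, you can sanity-check the dataset structure before training by inspecting element_spec; the tiny generator below is illustrative only. A correctly built two-output dataset should show two elements at the top level (inputs, targets), with the targets entry itself holding two specs:

```python
import tensorflow as tf

# Illustrative generator with the nested (input, (target_1, target_2)) layout.
def gen():
    yield [0.0, 1.0], ([1.0, 0.0], [1.0])

ds = tf.data.Dataset.from_generator(
    gen,
    output_signature=(
        tf.TensorSpec(shape=(2,), dtype=tf.float32),
        (
            tf.TensorSpec(shape=(2,), dtype=tf.float32),
            tf.TensorSpec(shape=(1,), dtype=tf.float32),
        ),
    ),
)

# Top level: (inputs, targets); the targets entry is a 2-tuple,
# one spec per model output.
print(len(ds.element_spec), len(ds.element_spec[1]))
```

If the top level instead shows three elements, Keras will treat the third as sample_weights, which is exactly the "expecting 2 targets while getting 1" failure mode described in the question.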
