
How to use distributed training with a custom loss using Tensorflow?

I have a transformer model I'd like to train distributed across several workers on the Google Cloud AI Platform, using Actor-Critic RL for training. I have my data broken up into individual files by date and uploaded to Cloud Storage. Since I'm using Actor-Critic RL, I have a custom loss function that calculates and applies the gradient. All the examples I've come across for distributed training make use of model.fit, which I'm not going to be able to do. I haven't been able to find any information on using a custom loss instead.

Along with distributing it across several machines, I'd like to know how to properly distribute training across several CPU cores as well. From my understanding, model.fit normally takes care of this.

Here's the custom loss function; right now it's the equivalent of a batch size of 1, I believe:

def learn(self, state_value_starting: tf.Tensor, probabilities: tf.Tensor, state_new: tf.Tensor,
            reward: tf.Tensor, is_done: tf.Tensor):
    with tf.GradientTape() as tape:
        state_value_starting = tf.squeeze(state_value_starting)
        state_value_new, _ = self.call(state_new)
        state_value_new = tf.squeeze(state_value_new)

        action_probabilities = tfp.distributions.Categorical(probs=probabilities)
        log_probability = action_probabilities.log_prob(self._last_action)

        # Mask the bootstrapped value on terminal states; tf.cast (rather than
        # Python int()) keeps this usable inside tf.function / graph mode.
        not_done = 1.0 - tf.cast(is_done, tf.float32)

        # TD error, used both as the advantage for the actor and as the
        # regression target for the critic.
        delta = reward + (self._discount_factor * state_value_new * not_done) - state_value_starting
        actor_loss = -log_probability * delta
        critic_loss = delta ** 2
        total_loss = actor_loss + critic_loss

    gradient = tape.gradient(total_loss, self.trainable_variables)
    self.optimizer.apply_gradients(zip(gradient, self.trainable_variables))

The TensorFlow Models repository provides a worked solution, defined in model_lib_v2.py.

See the function train_loop; the custom training loop it constructs makes use of:

strategy = tf.compat.v2.distribute.get_strategy()  # model_lib_v2.py, line 501
with strategy.scope():
    ...  # training step

The custom loss is applied in the function eager_train_step.
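The same pattern can be used outside of model_lib_v2.py. Below is a minimal, self-contained sketch of a custom training step under a distribution strategy; the model, the toy squared-error loss, and the random dataset are placeholders standing in for the question's Actor-Critic code, not part of it. On a single multi-worker setup you would swap `MirroredStrategy` for `tf.distribute.MultiWorkerMirroredStrategy` (configured via `TF_CONFIG`), but the step function itself stays the same:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # one replica per local device
global_batch_size = 8

with strategy.scope():
    # Variables must be created under the strategy scope so they are
    # mirrored across replicas.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.Adam(1e-3)

@tf.function
def train_step(x, y):
    def step_fn(x, y):
        with tf.GradientTape() as tape:
            pred = model(x, training=True)
            # Scale the per-example loss by the GLOBAL batch size (not the
            # per-replica one) so summed gradients are correct.
            loss = tf.nn.compute_average_loss(
                tf.square(pred - y), global_batch_size=global_batch_size)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss
    per_replica_loss = strategy.run(step_fn, args=(x, y))
    return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica_loss, axis=None)

dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([64, 4]), tf.random.normal([64, 1]))).batch(global_batch_size)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

for x, y in dist_dataset:
    loss = train_step(x, y)
```

Note that the loss is computed and reduced per replica inside `step_fn`; `strategy.run` handles fanning the batch out across devices, which also answers the multi-core part of the question when the replicas are mapped to local devices.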
