How to get tensors from multiple models and average them?

I am trying to average the tensors of two models that have an identical structure but were trained on different datasets. The models are stored in ckpt files.

I looked at the avg_checkpoints module from tensor2tensor, but I have no idea how to use it.

How do I solve this problem?

from tensor2tensor.utils import avg_checkpoints

print(avg_checkpoints.checkpoint_exists("/"))
# This prints True on the console.
# I have copied the final ckpt of each model to the root directory.

avg_checkpoints.main(?)
# No idea what to replace the ? with.

avg_checkpoints.py is an executable script, so you can use it from the command line, e.g.:

python utils/avg_checkpoints.py \
  --checkpoints path/to/checkpoint1,path/to/checkpoint2 \
  --num_last_checkpoints 2 \
  --output_path where/to/save/the/output
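
If you would rather trigger the script from Python than from a shell, one option is to build the same command and hand it to subprocess. This is only a sketch: build_avg_command is a hypothetical helper, not part of tensor2tensor, and the script path and checkpoint paths are the placeholders from the command above, assumed to be relative to your tensor2tensor checkout.

```python
import subprocess
import sys


def build_avg_command(checkpoints, output_path, script="utils/avg_checkpoints.py"):
    """Build the argument list for the averaging script (hypothetical helper).

    Only the three flags shown in the command above are used.
    """
    return [
        sys.executable,
        script,
        "--checkpoints", ",".join(checkpoints),
        "--num_last_checkpoints", str(len(checkpoints)),
        "--output_path", output_path,
    ]


cmd = build_avg_command(
    ["path/to/checkpoint1", "path/to/checkpoint2"],  # placeholder paths
    "where/to/save/the/output",                      # placeholder path
)
print(" ".join(cmd))

# Uncomment once the paths point at real files:
# subprocess.run(cmd, check=True)
```

Running the script in a separate process avoids having to set up its command-line flags by hand inside your own program.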

Note that if the two checkpoints were trained from scratch on different datasets, the averaging will not work. If you had a single pre-trained model that you fine-tuned on two different datasets, then the averaging could work.

You can average more than two checkpoints. A hacky but simple way to give a checkpoint more weight is to include it multiple times in --checkpoints (and increase --num_last_checkpoints accordingly).
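
To see why repeating a checkpoint acts as a weight, here is a toy sketch of element-wise averaging. It is not the tensor2tensor implementation: average_checkpoints is a hypothetical function, and each "checkpoint" is just a dict from a made-up variable name to a flat list of floats standing in for a tensor.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of same-named variables across checkpoints.

    Each checkpoint is a dict: variable name -> flat list of floats
    (a toy stand-in for real tensors). All checkpoints are assumed to
    share the same names and shapes.
    """
    n = len(checkpoints)
    return {
        name: [sum(vals) / n for vals in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }


# Two made-up checkpoints with identical structure.
ckpt_a = {"dense/kernel": [1.0, 2.0], "dense/bias": [0.0]}
ckpt_b = {"dense/kernel": [5.0, 6.0], "dense/bias": [4.0]}

# Plain average: each checkpoint has weight 0.5.
plain = average_checkpoints([ckpt_a, ckpt_b])

# Listing ckpt_a three times and ckpt_b once gives weights 0.75 and 0.25.
weighted = average_checkpoints([ckpt_a, ckpt_a, ckpt_a, ckpt_b])

print(plain)     # {'dense/kernel': [3.0, 4.0], 'dense/bias': [2.0]}
print(weighted)  # {'dense/kernel': [2.0, 3.0], 'dense/bias': [1.0]}
```

The weights you can reach this way are limited to fractions of the total number of entries, which is why this is a hack rather than a general weighting scheme.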
