Shape mismatch problem in tensorflow 2.2 training using yolo4.cfg

I recently added a new feature to my yolov3 implementation: models are now loaded directly from DarkNet cfg files for convenience. I tested the code with both the yolov3 and the yolov4 configuration, and both work fine except for v4 training. Shortly after training starts I get a shape mismatch error. I would be very grateful if someone could help me get rid of it so I can finally complete my project. Let me know in the comments if you need anything and I will provide whatever resources help with fixing the problem. Thanks in advance.

This is what I run to reproduce the problem:

if __name__ == '__main__':
    tr = Trainer((608, 608, 3),
                 '../Config/yolo4.cfg',
                 '../Config/beverly_hills.txt',
                 1344, 756, score_threshold=0.1,
                 train_tf_record='../Data/TFRecords/beverly_hills_train.tfrecord',
                 valid_tf_record='../Data/TFRecords/beverly_hills_test.tfrecord')

    tr.train(
        100,
        8,
        1e-3,
        dataset_name='beverly_hills',
        merge_evaluation=False,
        n_epoch_eval=10,
        clear_outputs=True
    )

Links to the files you need:

Here is the error message:

Traceback (most recent call last):
  File "trainer.py", line 629, in <module>
    clear_outputs=True
  File "../Helpers/utils.py", line 62, in wrapper
    result = func(*args, **kwargs)
  File "trainer.py", line 490, in train
    validation_data=valid_dataset,
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1090, in fit
    tmp_logs = train_function(iterator)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 766, in __call__
    result = self._call(*args, **kwds)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 826, in _call
    return self._stateless_fn(*args, **kwds)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2811, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1838, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1914, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 549, in call
    ctx=ctx)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [4,76,76,3,1] vs. [4,19,19,3,1]
     [[node yolo_loss/logistic_loss/mul (defined at ../Helpers/utils.py:260) ]] [Op:__inference_train_function_38735]

Errors may have originated from an input operation.
Input Source operations connected to node yolo_loss/logistic_loss/mul:
 yolo_loss/split_1 (defined at ../Helpers/utils.py:222) 
 yolo_loss/split (defined at ../Helpers/utils.py:196)

Function call stack:
train_function

When I change the batch_size to 8 instead of 4, the error changes to the following (note that the error source is different):

Traceback (most recent call last):
  File "/Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Main/trainer.py", line 693, in <module>
    clear_outputs=True,
  File "/Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Helpers/utils.py", line 62, in wrapper
    result = func(*args, **kwargs)
  File "/Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Main/trainer.py", line 526, in train
    validation_data=valid_dataset,
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 66, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 848, in fit
    tmp_logs = train_function(iterator)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1665, in _filtered_call
    self.captured_inputs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1746, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 598, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [8,13,13,3,2] vs. [8,52,52,3,2]
     [[node gradient_tape/yolo_loss/sub_5/BroadcastGradientArgs (defined at Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Main/trainer.py:526) ]] [Op:__inference_train_function_42744]

Function call stack:
train_function

Adding these lines to models.py solved the shape problem, and training started as expected:

if '4' in self.model_configuration:
    self.output_layers.reverse()
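
A likely reason the reversal helps, sketched under assumptions rather than taken from the project's code: the stock yolov3.cfg lists its [yolo] heads from coarse to fine (strides 32, 16, 8), while the stock yolov4.cfg lists them from fine to coarse (strides 8, 16, 32). If the labels and the loss expect the yolov3 order, the first and last heads get paired with the wrong grids, which matches the 76 vs. 19 shapes in the first traceback. The names below (`head_grids`, `mismatched_scales`, `coarse_to_fine`) are hypothetical and exist only for illustration.

```python
# Strides in the order the [yolo] sections appear in each stock cfg file.
YOLOV3_STRIDES = (32, 16, 8)
YOLOV4_STRIDES = (8, 16, 32)


def head_grids(input_size, strides):
    """Grid size of each detection head, in cfg order."""
    return [input_size // stride for stride in strides]


def mismatched_scales(pred_grids, label_grids):
    """Pairs (prediction grid, label grid) that disagree at the same position."""
    return [(p, t) for p, t in zip(pred_grids, label_grids) if p != t]


def coarse_to_fine(output_shapes):
    """Order head shapes by ascending grid size, independent of the cfg name."""
    return sorted(output_shapes, key=lambda shape: shape[1])


# Assumption: labels are built in yolov3 order (coarse to fine).
label_grids = head_grids(608, YOLOV3_STRIDES)
v4_grids = head_grids(608, YOLOV4_STRIDES)

print(mismatched_scales(v4_grids, label_grids))        # heads in cfg order
print(mismatched_scales(v4_grids[::-1], label_grids))  # heads reversed
```

With a 608 input, the cfg-order comparison reports the 76/19 pairing and the reversed one reports no mismatch. Ordering heads by grid size, as in `coarse_to_fine`, might be a sturdier alternative to checking for `'4'` in the configuration path, since that substring test would also fire on any unrelated path that happens to contain a 4.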
