
Tensorflow Object Detection API Faster-RCNN converges but detection is inaccurate

I am trying to use the Tensorflow Object Detection API to recognize the Guinness logo. The process is similar to the one shown here: https://towardsdatascience.com/building-a-toy-detector-with-tensorflow-object-detection-api-63c0fdf2ac95

I have prepared 100 training images, which I augment (using imgaug) to reach a total of around 5000 training images. In Tensorboard I see what looks like a good learning curve during training, reaching a loss of < 0.1, but when I export and test the graph I get lots of false positives and very inaccurate results. I am trying to work out why this is.
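One subtle failure mode with bounding-box augmentation is transforming the image but not the box. The question uses imgaug, which handles this; as a minimal NumPy-only sketch of the invariant that matters (the box coordinates must be transformed with the pixels), here is a horizontal flip that keeps the two in sync. The function name is illustrative, not from any library:

```python
import numpy as np

def hflip_with_box(image, box):
    """Horizontally flip an image and transform its bounding box to match.

    box is (xmin, ymin, xmax, ymax) in pixel coordinates.
    """
    h, w = image.shape[:2]
    flipped = image[:, ::-1]  # reverse the x axis
    xmin, ymin, xmax, ymax = box
    # After a flip, the old right edge becomes the new left edge.
    new_box = (w - xmax, ymin, w - xmin, ymax)
    return flipped, new_box

# Tiny demonstration on an 8x10 dummy image.
img = np.arange(80).reshape(8, 10)
out, bb = hflip_with_box(img, (2, 1, 5, 4))
print(bb)  # -> (5, 1, 8, 4)
```

If an augmentation pipeline rotates, crops, or scales images without applying the same transform to the boxes, the network is trained on systematically wrong labels, which produces exactly the "converges but detects badly" symptom.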

Tensorboard performance graphs: [image]

Bad detection example: [image]

Note: to automate labelling of my images, I cropped the original 100 neatly around the logo, then programmatically placed each crop on a random background image and recorded the bounding box around it. Example:

Like so: [image]

All the training images are 800x600, but as you can see, the actual bounding box and logo are much smaller than the image.
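The paste-onto-random-background step described above can be sketched as follows. This is an assumed reconstruction, not the asker's actual script: it composites a logo array onto a canvas at a random position and returns the bounding box that the paste creates (the logo size 291x137 matches the box in the XML annotation below only by way of example):

```python
import random
import numpy as np

def composite(logo, background):
    """Paste a logo array at a random position on a background array
    and return the composite plus the bounding box it creates."""
    bh, bw = background.shape[:2]
    lh, lw = logo.shape[:2]
    xmin = random.randint(0, bw - lw)
    ymin = random.randint(0, bh - lh)
    out = background.copy()
    out[ymin:ymin + lh, xmin:xmin + lw] = logo
    return out, (xmin, ymin, xmin + lw, ymin + lh)

# 800x600 white canvas, 291x137 black placeholder "logo".
background = np.full((600, 800, 3), 255, dtype=np.uint8)
logo = np.zeros((137, 291, 3), dtype=np.uint8)
img, (xmin, ymin, xmax, ymax) = composite(logo, background)
print(xmax - xmin, ymax - ymin)  # -> 291 137
```

Note that compositing like this produces hard cut-out edges and unrealistic contexts, which (as the accepted answer below found) can hurt real-world detection.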

And here is the XML annotation file for that same image:

    <?xml version="1.0" encoding="utf-8"?>
    <annotation>
        <folder>images</folder>
        <filename>57.png</filename>
        <path>model\\images\\57.png</path>
        <source>
            <database>Unknown</database>
        </source>
        <size>
            <width>800</width>
            <height>600</height>
            <depth>3</depth>
        </size>
        <segmented>0</segmented>
        <object>
            <name>guinness</name>
            <pose>Unspecified</pose>
            <truncated>0</truncated>
            <difficult>0</difficult>
            <bndbox>
                <xmin>225</xmin>
                <ymin>329</ymin>
                <xmax>516</xmax>
                <ymax>466</ymax>
            </bndbox>
        </object>
    </annotation>
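Annotations in this Pascal VOC style can be generated programmatically with the standard library, which is handy when the bounding box is already known from the compositing step. A minimal sketch (it emits only the fields the Object Detection API's VOC converter actually reads, so some of the fields above are omitted):

```python
import xml.etree.ElementTree as ET

def voc_annotation(filename, size, label, box):
    """Build a minimal Pascal VOC-style annotation as an XML string.

    size is (width, height); box is (xmin, ymin, xmax, ymax).
    """
    width, height = size
    ann = ET.Element("annotation")
    ET.SubElement(ann, "folder").text = "images"
    ET.SubElement(ann, "filename").text = filename
    sz = ET.SubElement(ann, "size")
    ET.SubElement(sz, "width").text = str(width)
    ET.SubElement(sz, "height").text = str(height)
    ET.SubElement(sz, "depth").text = "3"
    obj = ET.SubElement(ann, "object")
    ET.SubElement(obj, "name").text = label
    bb = ET.SubElement(obj, "bndbox")
    for tag, val in zip(("xmin", "ymin", "xmax", "ymax"), box):
        ET.SubElement(bb, tag).text = str(val)
    return ET.tostring(ann, encoding="unicode")

xml = voc_annotation("57.png", (800, 600), "guinness", (225, 329, 516, 466))
```

Parsing the result back with `ET.fromstring` is an easy sanity check that the box coordinates survived the round trip.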

Does anybody know why Tensorflow would correctly classify the test images, yet give such inaccurate detections when I test on a real-world image? Any advice is welcome, and feel free to ask for more information.

Couple of thoughts:

  • Is your test image also the same size, 800x600?

  • You probably want to play with the image_resizer value in the config file.
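For reference, the `image_resizer` block the answer refers to lives in the model's pipeline config. The values below are the ones commonly seen in the Faster R-CNN sample configs shipped with the Object Detection API; treat them as illustrative defaults to tune, not a prescription:

```
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}
```

If the logo occupies only a small fraction of an 800x600 frame, resizing behaviour directly affects how many pixels the feature extractor actually sees of it.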

Eventually I gave up on the method of placing my logo images onto a random background; instead I manually labelled them, and then used image augmentation to increase my training set size. This seemed to greatly improve my results. I think a contextually accurate background is actually quite important in training.

Hopefully this is helpful to some, thanks for the help.
