Polygon bounding box with Tensorflow
I understand that the TensorFlow API for training on custom object detection datasets uses only rectangular bounding boxes, defined by xmin, xmax, ymin and ymax. I also understand that a polygon bounding box would greatly improve detection accuracy, because it excludes the unnecessary background that a rectangle includes and so gives a far better training dataset. I currently use labelImg to annotate all my training images, and it does offer polygon boxes. My question is: is there a way to modify the code in the TensorFlow API to work with polygon boxes instead of only rectangular boxes?

No. At this point you may be more interested in instance segmentation with a model like Mask R-CNN (not implemented in Tensorflow's object detection API at the time of writing). The models in the API have specific differentiable, and therefore trainable, layers that find bounding boxes. A polygon has more degrees of freedom than a rectangle, so a model that predicts polygons would be more complicated. Mask R-CNN partly solves the polygon problem by identifying the object first, then segmenting which pixels within the bounding box belong to the object and which belong to the background.
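To make the "box plus mask" idea concrete, here is a minimal pure-Python sketch of how one polygon annotation could be turned into both a rectangular box and a per-pixel mask. It is only an illustration, not code from Mask R-CNN or the TensorFlow API; the function names and the sample triangle are made up for this example.

```python
def polygon_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) enclosing a list of (x, y) vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(px, py, points):
    """Even-odd ray casting: count edge crossings to the right of the point."""
    inside = False
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(points, width, height):
    """Rasterize the polygon into a height x width grid of 0/1 values,
    testing the center of each pixel."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, points) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical annotation: a triangle inside an 8x8 image.
triangle = [(1, 1), (7, 1), (4, 6)]
bbox = polygon_to_bbox(triangle)
mask = polygon_to_mask(triangle, width=8, height=8)
```

The box is what a rectangle-only detector trains on, while the mask marks which pixels inside that box belong to the object. For real images a library rasterizer would be much faster than this per-pixel loop.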
Here is an introduction to some of the popular algorithms used in object detection and how they work:
https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4?gi=b386f4274020

Well, I modified my polygon. I drew the polygon inside a rectangle with the same dimensions as my image (filled with zeros, like [w=0,h=0,color=0]) and used it as the mask for my training.
Here is the snippet from my code where this happens:
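    import numpy as np
    import tensorflow as tf

    # A NumPy image array has shape (height, width, channels)
    im_h, im_w, color = img.shape
    ...
    image_masked, img_masks = mask_polygon_over_image(image, all_lists_poligon)

    # Convert the image and its mask to rectangular tensors of the same size
    # (tf.image.resize takes the target size as (height, width))
    tf_input_image = tf.image.resize(np.uint8(img), (im_h, im_w))
    tf_input_mask = tf.image.resize(np.uint8(img_masks), (im_h, im_w))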
I'm testing this now. The all_lists_poligon variable comes from an annotated JSON file.
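The layout of that JSON file is not shown here, so the following is only a sketch of how such a list of polygons might be loaded. It assumes a hypothetical structure with a top-level "shapes" list whose entries each carry a "points" list of [x, y] pairs; the load_polygons name and the sample data are also made up for illustration.

```python
import json


def load_polygons(json_text):
    """Parse annotation JSON into a list of polygons, each a list of (x, y) tuples.

    Assumes a hypothetical layout: {"shapes": [{"points": [[x, y], ...]}, ...]}.
    """
    data = json.loads(json_text)
    all_lists_poligon = []
    for shape in data.get("shapes", []):
        # Tuples of floats are a vertex format that PIL's ImageDraw.polygon accepts.
        all_lists_poligon.append([(float(x), float(y)) for x, y in shape["points"]])
    return all_lists_poligon


# Made-up annotation with a single triangular polygon.
example = '{"shapes": [{"label": "leaf", "points": [[10, 10], [50, 12], [30, 40]]}]}'
polygons = load_polygons(example)
```

If the real file uses different keys, only the two lookups inside the loop need to change.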
It just worked for me!
Take a look at the mask_polygon_over_image function:
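    from PIL import Image, ImageDraw

    def mask_polygon_over_image(image, all_lists_poligon):
        # Blank white canvas that will hold the mask (size is hard-coded to 3000x3000)
        new_image = Image.new(mode="RGB", size=(3000, 3000), color=(255, 255, 255))
        # The "RGBA" draw mode lets the semi-transparent fill blend with the original image
        draw = ImageDraw.Draw(image, "RGBA")
        draw_new = ImageDraw.Draw(new_image)
        for list_polygon in all_lists_poligon:
            # Translucent green overlay on the original image, for visual inspection
            draw.polygon(list_polygon, fill=(50, 255, 50, 105), outline="yellow")
            # Opaque green polygon on the blank canvas; this is the mask
            draw_new.polygon(list_polygon, fill=(0, 185, 0), outline="yellow")
        return image, new_image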
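
The mask returned above is an RGB picture (white background, green polygons, yellow outlines), while segmentation training usually expects a single-channel mask. The sketch below shows one possible way to collapse it to 0/1 values with NumPy. The rgb_mask_to_binary helper is hypothetical and not part of the original code; it assumes the mask has first been converted to an array, for example with np.array(new_image), and it counts the outline pixels as part of the object.

```python
import numpy as np


def rgb_mask_to_binary(mask_rgb, background=(255, 255, 255)):
    """Return a (height, width) uint8 array: 1 where a pixel differs from the
    background color, 0 where it matches it."""
    mask_rgb = np.asarray(mask_rgb)
    # A pixel is background only if all three channels match the background color.
    is_background = np.all(mask_rgb == np.array(background), axis=-1)
    return (~is_background).astype(np.uint8)


# Tiny stand-in for np.array(new_image): a white 4x4 image with a 2x2 green square.
demo = np.full((4, 4, 3), 255, dtype=np.uint8)
demo[1:3, 1:3] = (0, 185, 0)
binary = rgb_mask_to_binary(demo)
```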