Polygon bounding boxes with TensorFlow

Question

I understand that the TensorFlow Object Detection API trains on custom datasets using only rectangular bounding boxes, defined by xmin, xmax, ymin, and ymax. I also understand that a polygon bounding box would greatly improve detection accuracy, since it removes unnecessary information inside the bounding box and so yields a far better training dataset. I currently use labelImg to annotate all my training images, and it does offer polygon boxes. My question is: is there a way to modify the code in the TensorFlow API to work with polygon boxes rather than only rectangular boxes?

Answer 1

No. At this point you may be more interested in instance segmentation models such as Mask R-CNN (not implemented in TensorFlow's object detection API at the time of writing). The models in the API have specific differentiable (and thus trainable) layers that find bounding boxes. A polygon model would have more degrees of freedom and would be more complicated. Mask R-CNN somewhat solves the polygon problem by identifying the object, then segmenting which pixels inside the bounding box belong to the object and which belong to the background.

Here is an introduction to some of the popular algorithms used in object detection and how they work:

https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4?gi=b386f4274020
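If polygon annotations already exist, one way to keep using the box-based models is to reduce each polygon to its enclosing rectangle. The following is only a minimal sketch under the assumption that a polygon is a list of (x, y) pixel tuples; `polygon_to_bbox` is a hypothetical helper, not part of the TensorFlow API.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_to_bbox(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (xmin, ymin, xmax, ymax) of the rectangle enclosing the polygon."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical triangle given in pixel coordinates.
triangle = [(10, 40), (60, 15), (35, 90)]
bbox = polygon_to_bbox(triangle)
print(bbox)
```

The rectangle discards the polygon's shape, which is exactly the information a segmentation model would keep as a mask.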

Answer 2

Well, I modified my polygons. I drew each polygon inscribed in a rectangle with the same dimensions as my image (filled with zeros, like [w=0, h=0, color=0]) and used the result as a mask for my training.

Here is a snippet showing where this happens in my code:

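import numpy as np
import tensorflow as tf

# img.shape is (height, width, channels) for a NumPy image array
im_h, im_w, channels = img.shape
...
  image_masked, img_masks = mask_polygon_over_image(image, all_lists_poligon)

  # Resize image and mask to the same rectangle; tf.image.resize expects (height, width)
  tf_input_image = tf.image.resize(np.uint8(img), (im_h, im_w))
  tf_input_mask  = tf.image.resize(np.uint8(img_masks), (im_h, im_w))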

I'm testing this now. The all_lists_poligon value comes from an annotated JSON file.
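The answer does not show the JSON layout, so the following is only a sketch of how such a file might be turned into the list of polygons. The `{"shapes": [{"points": [[x, y], ...]}]}` structure and the `load_polygons` helper are assumptions made for illustration; a real annotation file may be organized differently.

```python
import json
from typing import List, Tuple


def load_polygons(json_text: str) -> List[List[Tuple[float, float]]]:
    """Parse annotation JSON into a list of polygons, each a list of (x, y) tuples.

    Assumes a hypothetical layout: {"shapes": [{"points": [[x, y], ...]}, ...]}.
    """
    data = json.loads(json_text)
    polygons = []
    for shape in data.get("shapes", []):
        # ImageDraw.polygon accepts a sequence of (x, y) tuples.
        polygons.append([(float(x), float(y)) for x, y in shape["points"]])
    return polygons


# Hypothetical annotation with a single triangular polygon.
sample = '{"shapes": [{"label": "leaf", "points": [[10, 40], [60, 15], [35, 90]]}]}'
all_lists_poligon = load_polygons(sample)
print(all_lists_poligon)
```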

It just worked for me!

Take a look at the mask_polygon_over_image function:

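from PIL import Image, ImageDraw

def mask_polygon_over_image(image, all_lists_poligon):
    # Blank white canvas for the mask; the 3000x3000 size is hard-coded here
    new_image = Image.new(mode="RGB", size=(3000, 3000), color=(255, 255, 255))
    # 'RGBA' drawing mode lets the semi-transparent fill blend over the original image
    draw = ImageDraw.Draw(image, 'RGBA')
    draw_new = ImageDraw.Draw(new_image)
    for list_polygon in all_lists_poligon:
        # Overlay each polygon on the image and draw it solid on the mask canvas
        draw.polygon(list_polygon, fill=(50, 255, 50, 105), outline="yellow")
        draw_new.polygon(list_polygon, fill=(0, 185, 0), outline="yellow")
    # Returns the annotated image (modified in place) and the mask image
    return image, new_image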
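For training, a single-channel mask with a zero background is often easier to work with than a colored RGB canvas. The sketch below is a variant and not the code from this answer: `polygons_to_binary_mask` is a hypothetical name, and it assumes the canvas size is passed in as (width, height) so it can match the source image.

```python
import numpy as np
from PIL import Image, ImageDraw


def polygons_to_binary_mask(size, polygons):
    """Rasterize polygons into a uint8 array of shape (height, width): 1 inside, 0 outside.

    size is (width, height), following PIL's convention.
    """
    mask = Image.new(mode="L", size=size, color=0)  # single channel, zero background
    draw = ImageDraw.Draw(mask)
    for polygon in polygons:
        draw.polygon(polygon, fill=1)
    return np.array(mask, dtype=np.uint8)


# Hypothetical 100x80 image with one square polygon.
square = [(20, 20), (60, 20), (60, 60), (20, 60)]
mask = polygons_to_binary_mask((100, 80), [square])
print(mask.shape)  # (80, 100): NumPy arrays are (height, width)
```

Passing `image.size` as the first argument would keep the mask aligned with the image without a later resize.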

Answer 1
0 2018-01-23 03:02:46

Answer 2
0 2022-09-13 19:43:37
