How to convert 2D bounding box pixel coordinates (x, y, w, h) into relative coordinates (YOLO format)?
Hi! I am annotating image data through an online platform that generates output coordinates like this: "bbox":{"top":634,"left":523,"height":103,"width":145}. However, I want to use these annotations to train YOLO, so I have to convert them to YOLO format, like this: 4 0.838021 0.605556 0.177083 0.237037
I need help with how to do this conversion.
For the size, pass the image's (w, h); for the box, pass (x, x+w, y, y+h), where x, y is the top-left corner of the box and w, h is its size in pixels. The function comes from https://github.com/ivder/LabelMeYoloConverter/blob/master/convert.py
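def convert(size, box):
    # size: (image_width, image_height) in pixels
    # box: (xmin, xmax, ymin, ymax) in pixels
    dw = 1./size[0]
    dh = 1./size[1]
    # box center and box size, still in pixels
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    # normalize by the image size
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)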
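As a usage sketch, the bbox dictionary from the question can be fed to this kind of function as follows. The image size (1920, 1080), the class id 0, and the helper name `bbox_to_yolo_line` are assumptions made for illustration only; the question does not state the real image dimensions or class.

```python
def convert(size, box):
    """size = (img_w, img_h); box = (xmin, xmax, ymin, ymax), all in pixels."""
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0   # box center x in pixels
    y = (box[2] + box[3]) / 2.0   # box center y in pixels
    w = box[1] - box[0]           # box width in pixels
    h = box[3] - box[2]           # box height in pixels
    return (x * dw, y * dh, w * dw, h * dh)


def bbox_to_yolo_line(bbox, image_size, class_id):
    """Hypothetical helper: platform bbox dict -> one YOLO label line."""
    left, top = bbox["left"], bbox["top"]
    width, height = bbox["width"], bbox["height"]
    box = (left, left + width, top, top + height)
    x, y, w, h = convert(image_size, box)
    return f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"


bbox = {"top": 634, "left": 523, "height": 103, "width": 145}
# (1920, 1080) and class id 0 are assumed values, not taken from the question.
print(bbox_to_yolo_line(bbox, (1920, 1080), 0))
```

Every value after the class id should fall between 0 and 1; a value above 1 usually means the wrong image size was passed.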
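Alternatively, you can pass the top-left corner and size of the box together with the image size:
def convert(x, y, w, h, img_w, img_h):
    # x, y: top-left corner of the box; w, h: box size (all in pixels)
    # img_w, img_h: image size in pixels
    dw = 1.0/img_w
    dh = 1.0/img_h
    # box center in pixels
    x = (2*x+w)/2.0
    y = (2*y+h)/2.0
    # normalize by the image size
    x = x*dw
    y = y*dh
    w = w*dw
    h = h*dh
    return (x,y,w,h)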
Each grid cell predicts B bounding boxes as well as C class probabilities. The bounding box prediction has 5 components: (x, y, w, h, confidence). The (x, y) coordinates represent the center of the box, relative to the grid cell location (remember that, if the center of the box does not fall inside the grid cell, then this cell is not responsible for it). These coordinates are normalized to fall between 0 and 1. The (w, h) box dimensions are also normalized to [0, 1], relative to the image size.
What does the coordinate output of the YOLO algorithm represent?
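To make the "relative to the grid cell" wording concrete, here is a minimal sketch of how an image-normalized box center maps to the responsible cell and to an offset inside that cell. The grid size S = 7 (the value used in the original YOLO paper) and the function name `cell_and_offset` are assumptions for illustration; the sample center is taken from the YOLO line in the question. Label files themselves store the center normalized by the whole image, so this mapping only describes the network's predictions.

```python
def cell_and_offset(xc, yc, S=7):
    """Map an image-normalized center (xc, yc in [0, 1]) to its grid cell.

    Returns (col, row, x_off, y_off): (col, row) is the responsible cell and
    (x_off, y_off) is the center's position inside that cell, in cell units.
    """
    col = min(int(xc * S), S - 1)   # clamp so xc == 1.0 stays in the last cell
    row = min(int(yc * S), S - 1)
    return col, row, xc * S - col, yc * S - row


# Center taken from the question's example line; S = 7 is an assumed grid size.
col, row, x_off, y_off = cell_and_offset(0.838021, 0.605556)
print(col, row, round(x_off, 3), round(y_off, 3))
```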
If you want to convert a Python dictionary with the keys top, left, width, height into a list in the format [x1, y1, x2, y2], where x1, y1 are the relative coordinates of the top left corner of the bounding box and x2, y2 are the relative coordinates of the bottom right corner of the bounding box, you can use the following function:
def bbox_dict_to_list(bbox_dict, image_size):
    # pixel values from the annotation dictionary
    h = bbox_dict.get('height')
    l = bbox_dict.get('left')
    t = bbox_dict.get('top')
    w = bbox_dict.get('width')

    img_w, img_h = image_size

    # top-left and bottom-right corners, relative to the image size
    x1 = l/img_w
    y1 = t/img_h
    x2 = (l+w)/img_w
    y2 = (t+h)/img_h
    return [x1, y1, x2, y2]
You must pass the bbox dictionary and the image size as arguments, with the image size given as a tuple (image_width, image_height).
Example
bbox = {"top":634,"left":523,"height":103,"width":145}
bbox_dict_to_list(bbox, (1280, 720))
>> [0.40859375, 0.8805555555, 0.521875, 1.02361111111]
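You can change the return order to suit your needs. Note that y2 is greater than 1 in this example, which means the box extends past the bottom of a 720-pixel-high image; this suggests the annotated image is actually taller, so make sure to pass the real dimensions of your image.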
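For instance, if the goal is the YOLO layout from the question (center x, center y, width, height, all relative to the image), the corner list can be rearranged as in the sketch below. The function name `corners_to_yolo` is hypothetical, and the sample values are the example output above, which was computed for an assumed 1280x720 image.

```python
def corners_to_yolo(corners):
    """Convert relative [x1, y1, x2, y2] corners to (xc, yc, w, h)."""
    x1, y1, x2, y2 = corners
    xc = (x1 + x2) / 2.0   # center is the midpoint of the two corners
    yc = (y1 + y2) / 2.0
    return (xc, yc, x2 - x1, y2 - y1)


# Sample corners copied from the example output above (assumed image size).
xc, yc, w, h = corners_to_yolo([0.40859375, 0.8805555555, 0.521875, 1.02361111111])
print(f"{xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
```

Prefix the class id to that line to get a complete YOLO label entry.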