How to convert 2D bounding box pixel coordinates (x, y, w, h) into relative coordinates (Yolo format)?

Hi! I am annotating image data through an online platform that generates output coordinates like this: "bbox":{"top":634,"left":523,"height":103,"width":145}. However, I want to use these annotations to train Yolo, so I have to convert them into Yolo format, which looks like this: 4 0.838021 0.605556 0.177083 0.237037

How can I do this conversion?

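Here, for size you need to pass the image (width, height), and for box you need to pass (x, x+w, y, y+h), i.e. (xmin, xmax, ymin, ymax), where x is the bbox "left", y is the bbox "top", and w, h are the bbox width and height in pixels. The function comes from https://github.com/ivder/LabelMeYoloConverter/blob/master/convert.py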

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
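
As a usage sketch, the bbox from the question could be passed to this function as shown below. The function is repeated in compact form so the snippet runs on its own. The image size of 1280x720 is an assumption (the question does not state it), so substitute the real dimensions of your image.

```python
def convert(size, box):
    # size = (image_width, image_height); box = (xmin, xmax, ymin, ymax), in pixels
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0  # box center x, in pixels
    y = (box[2] + box[3]) / 2.0  # box center y, in pixels
    w = box[1] - box[0]          # box width, in pixels
    h = box[3] - box[2]          # box height, in pixels
    return (x * dw, y * dh, w * dw, h * dh)

bbox = {"top": 634, "left": 523, "height": 103, "width": 145}
image_size = (1280, 720)  # ASSUMPTION: not given in the question

# Build (xmin, xmax, ymin, ymax) from the platform's top/left/width/height
box = (
    bbox["left"],
    bbox["left"] + bbox["width"],
    bbox["top"],
    bbox["top"] + bbox["height"],
)
yolo_box = convert(image_size, box)
print(yolo_box)  # (x_center, y_center, width, height), each relative to the image size
```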

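Alternatively, you can use the function below. It takes the top-left corner (x, y) and the box size (w, h) in pixels, plus the image width and height, which are needed for the normalization:

def convert(x, y, w, h, img_w, img_h):
    # (x, y): top-left corner of the box in pixels; (w, h): box size in pixels
    dw = 1.0/img_w
    dh = 1.0/img_h
    x = (2*x + w)/2.0  # box center x in pixels
    y = (2*y + h)/2.0  # box center y in pixels
    x = x*dw
    y = y*dh
    w = w*dw
    h = h*dh
    return (x, y, w, h)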

Each grid cell predicts B bounding boxes as well as C class probabilities. The bounding box prediction has 5 components: (x, y, w, h, confidence). The (x, y) coordinates represent the center of the box, relative to the grid cell location (remember that if the center of the box does not fall inside the grid cell, then this cell is not responsible for it). These coordinates are normalized to fall between 0 and 1. The (w, h) box dimensions are also normalized to [0, 1], relative to the image size.

What does the coordinate output of yolo algorithm represent?
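
To make the "relative to the grid cell" idea concrete, here is a minimal sketch that finds the responsible cell for a box center and the center's offset inside that cell. The function name `cell_relative_center` and the `grid_size` parameter are hypothetical names chosen for illustration. Note that this describes the network's predictions; as far as I can tell, the label-file format asked about in the question stores centers normalized by the full image size, not by the cell.

```python
def cell_relative_center(cx, cy, grid_size=7):
    """Map a center (cx, cy), normalized to the image, to its grid cell.

    Returns (row, col, x_offset, y_offset), where the offsets are in [0, 1]
    and measured from the top-left corner of the responsible cell.
    """
    # Clamp so that a center lying exactly on the right/bottom edge (1.0)
    # still falls into the last cell.
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    x_offset = cx * grid_size - col
    y_offset = cy * grid_size - row
    return row, col, x_offset, y_offset

# Illustrative values only: a 4x4 grid and a center at (0.625, 0.375)
row, col, x_off, y_off = cell_relative_center(0.625, 0.375, grid_size=4)
print(row, col, x_off, y_off)
```

Only the cell at (row, col) is responsible for this box; every other cell ignores it.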

Convert bbox dictionary into list with relative coordinates

If you want to convert a Python dictionary with the keys top, left, width, height into a list in the format [x1, y1, x2, y2], where x1, y1 are the relative coordinates of the top-left corner of the bounding box and x2, y2 are the relative coordinates of the bottom-right corner, you can use the following function:

def bbox_dict_to_list(bbox_dict, image_size):
  h = bbox_dict.get('height')
  l = bbox_dict.get('left')
  t = bbox_dict.get('top')
  w = bbox_dict.get('width')

  img_w, img_h = image_size

  x1 = l/img_w
  y1 = t/img_h
  x2 = (l+w)/img_w
  y2 = (t+h)/img_h
  return [x1, y1, x2, y2]

Pass the bbox dictionary and the image size as a tuple (image_width, image_height) as arguments.

Example

bbox = {"top":634,"left":523,"height":103,"width":145} 
bbox_dict_to_list(bbox, (1280, 720))
>> [0.40859375, 0.8805555555, 0.521875, 1.02361111111]

You can change the return order to suit your needs
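
For the Yolo format from the question (class id followed by the relative box center and size), a variation along these lines might work. It is a sketch under stated assumptions: the function name `bbox_dict_to_yolo_line` is hypothetical, the class id 4 is simply taken from the sample line in the question, and the 1280x720 image size is the same assumed value as in the example above.

```python
def bbox_dict_to_yolo_line(bbox_dict, image_size, class_id):
    """Format one bbox as 'class x_center y_center width height' (all relative)."""
    img_w, img_h = image_size
    # Center of the box in pixels, then normalized by the image size
    cx = (bbox_dict["left"] + bbox_dict["width"] / 2.0) / img_w
    cy = (bbox_dict["top"] + bbox_dict["height"] / 2.0) / img_h
    # Box size normalized by the image size
    w = bbox_dict["width"] / img_w
    h = bbox_dict["height"] / img_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, cx, cy, w, h)

bbox = {"top": 634, "left": 523, "height": 103, "width": 145}
# ASSUMPTIONS: image size and class id are illustrative, not from the annotation
line = bbox_dict_to_yolo_line(bbox, (1280, 720), class_id=4)
print(line)  # 4 0.465234 0.952083 0.113281 0.143056
```

Each annotated box would contribute one such line to the label file for its image.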


 