
COWC Dataset annotation

I'm new to deep learning. Currently, I am working on a project to detect cars in aerial imagery with a RetinaNet model, and I have planned to use the COWC dataset for it. I have a doubt about the annotation part: for now I am using the labelImg annotation tool to annotate cars in aerial images. Since labelImg generates annotations in XML format, I have converted them into the format required by the RetinaNet model, which is shown below.

(imagename) (bounding_box_coordinates) (class_name)
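The conversion can be sketched like this (a minimal example, assuming labelImg's default Pascal VOC XML layout; the sample XML, file name, and values are hypothetical):

```python
import csv
import xml.etree.ElementTree as ET

# Hypothetical example of labelImg's Pascal VOC output for one image.
SAMPLE_XML = """<annotation>
  <filename>aerial_0001.png</filename>
  <object>
    <name>car</name>
    <bndbox><xmin>34</xmin><ymin>50</ymin><xmax>74</xmax><ymax>90</ymax></bndbox>
  </object>
</annotation>"""

def voc_to_rows(xml_text):
    """Convert one Pascal VOC annotation into (image, x1, y1, x2, y2, class) rows."""
    root = ET.fromstring(xml_text)
    img_name = root.findtext('filename')
    rows = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        rows.append((img_name,
                     int(box.findtext('xmin')), int(box.findtext('ymin')),
                     int(box.findtext('xmax')), int(box.findtext('ymax')),
                     obj.findtext('name')))
    return rows

rows = voc_to_rows(SAMPLE_XML)
with open('train_annotations.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```

In practice you would call `ET.parse()` on each XML file labelImg wrote and append all the rows to one CSV.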

Is there any other way to make annotation easier for the COWC dataset?

Thanks in advance :)

The COWC dataset comes with annotations where each car is labeled with a single point, stored in a PNG file. Here's how I find the annotation locations in the PNG file.

import numpy as np
from PIL import Image

# Load the annotation PNG (one dot per car) into a NumPy array
annotation_path = 'cowc/datasets/ground_truth_sets/Toronto_ISPRS/03553_Annotated_Cars.png'
im = Image.open(annotation_path)
data = np.asarray(im)

The COWC dataset marks cars with a red dot and negative examples with a blue dot. The problem is that a dot pixel carries both a color value and an alpha value, and both would be indexed as nonzero even though we only need one of them. We don't need the alpha channel, so the array needs to be sliced down to the RGB channels so that we don't count the alpha channel and get duplicate index values.
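A toy example illustrating the duplicate-index problem (the 2x2 array is made up for demonstration, but mirrors a transparent background with one opaque red dot):

```python
import numpy as np

# Toy 2x2 RGBA "annotation": transparent background, one opaque red dot.
rgba = np.zeros((2, 2, 4), dtype=np.uint8)
rgba[0, 1] = [255, 0, 0, 255]   # red car dot at row 0, col 1

# Indexing the full RGBA array counts the dot twice (red value + alpha value).
y_dup, x_dup, _ = rgba.nonzero()
# Slicing off alpha leaves exactly one index entry per dot.
y_one, x_one, _ = rgba[:, :, 0:3].nonzero()
```

Here `y_dup` has two entries for the single dot, while `y_one` has one.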

data = data[:, :, 0:3]                 # drop the alpha channel
y_ind, x_ind, rgb_ind = data.nonzero() # indices of every nonzero value

You now have an index of all the points in the annotation file. y_ind corresponds to the height dimension and x_ind to the width. This means that at the first x, y position we should see an array that looks like [255, 0, 0]. This is what I get when I look up the first x, y position from the index:

>>> data[y_ind[0], x_ind[0]]
array([255,   0,   0], dtype=uint8)
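Note that this index also includes the blue negative dots. If you want boxes only for cars, one option (an addition on my part, not something the original code does) is to run nonzero() on the red channel alone. A sketch with a toy array:

```python
import numpy as np

# Toy RGB annotation: red dot (car) at (0, 0), blue dot (negative) at (1, 1).
data = np.zeros((2, 2, 3), dtype=np.uint8)
data[0, 0, 0] = 255   # red channel -> car
data[1, 1, 2] = 255   # blue channel -> negative example

# nonzero() over all channels would return both dots;
# indexing only the red channel keeps just the cars.
car_y, car_x = data[:, :, 0].nonzero()
```

Here `car_y`, `car_x` pick out only the red (car) dot.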

Here the author decides to create a bounding box centered on the annotation point provided in the dataset, extending 20 pixels in each direction (so 40 pixels on a side). To create a single bounding box for the first annotation in this image you can try this:

# define bbox given x, y and ensure bbox is within image bounds
def get_bbox(x, y, x_max, y_max):
    x1 = max(0, x - 20)     # returns zero if x-20 is negative
    x2 = min(x_max, x + 20) # returns x_max if x+20 is greater than x_max
    y1 = max(0, y - 20)
    y2 = min(y_max, y + 20)
    return x1, y1, x2, y2

x1, y1, x2, y2 = get_bbox(x_ind[0], y_ind[0], im.width, im.height) 

You'll have to loop through all the x, y values to make all the bounding boxes for the image. Here's a quick and dirty way to create a CSV file for a single image:

img_path = 'cowc/datasets/ground_truth_sets/Toronto_ISPRS/03553.png'
with open('anno.csv', 'w') as f:
    for x, y in zip(x_ind, y_ind):
        x1, y1, x2, y2 = get_bbox(x, y, im.width, im.height)
        line = f'{img_path},{x1},{y1},{x2},{y2},car\n'
        f.write(line)

I plan on breaking a huge image up into much smaller ones, which will change the values of the bounding boxes. I hope you find this helpful and that it's a good place to start.
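A rough sketch of the bookkeeping that tiling involves (the 512-pixel tile size is an assumption, and boxes that straddle a tile border are simply clipped here, which a real pipeline might handle differently, e.g. with overlapping tiles):

```python
def tile_boxes(boxes, tile=512):
    """Assign each (x1, y1, x2, y2) box to the tile containing its center
    and shift its coordinates into that tile's local frame."""
    tiled = {}
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
        tx, ty = cx // tile, cy // tile    # tile grid position
        ox, oy = tx * tile, ty * tile      # tile origin in the big image
        # Clip the shifted box to the tile bounds.
        local = (max(0, x1 - ox), max(0, y1 - oy),
                 min(tile, x2 - ox), min(tile, y2 - oy))
        tiled.setdefault((tx, ty), []).append(local)
    return tiled

# Two 40-pixel boxes from a large image; the second lands in the next tile over.
tiles = tile_boxes([(100, 100, 140, 140), (600, 80, 640, 120)])
```

Each tile key maps to boxes in that tile's own coordinate system, ready to be written out against the cropped tile's filename.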

