
How to create your own dataset for Mask-RCNN models from the Tensorflow Object Detection API?

I do not quite understand this guide: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/instance_segmentation.md

I have many objects of three classes. According to the guide, I have to make a mask with dimensions [N, H, W], where:

  • N - number of objects
  • H - image height
  • W - image width

I use this function to create the mask:

import cv2
import numpy as np

def image_mask(img, polygons):
    w, h = img.size
    n = len(polygons)
    mask = np.zeros([n, h, w], dtype=np.float32)
    for i in range(n):
        # cv2.fillPoly expects int32 points shaped (-1, 1, 2)
        polygon = polygons[i].reshape((-1, 1, 2)).astype(np.int32)
        tmp_mask = np.zeros([h, w], dtype=np.float32)
        cv2.fillPoly(tmp_mask, [polygon], 1)
        mask[i, :, :] = tmp_mask
    return mask
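For intuition, here is a hypothetical toy example (pure NumPy, no cv2, made up for illustration) of what such an [N, H, W] stack looks like for two objects on a 3x4 image:

```python
import numpy as np

# shape [N, H, W] = [2, 3, 4]: one binary plane per object
mask = np.zeros([2, 3, 4], dtype=np.float32)
mask[0, 0:2, 0:2] = 1.0  # object 0 covers the top-left 2x2 block
mask[1, 1:3, 2:4] = 1.0  # object 1 covers a bottom-right 2x2 block

print(mask.shape)  # (2, 3, 4)
```

Each plane mask[i] is a full-size H x W image that is 1 inside object i and 0 everywhere else.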

I use this guide for creating my dataset: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md

I append the mask to the end of tf_example:

tf_example = tf.train.Example(features=tf.train.Features(feature={
...
      'image/object/class/label': dataset_util.int64_list_feature(classes),
      'image/object/mask': dataset_util.bytes_list_feature(mask.reshape((-1))),
  }))

Because of the reshape (I suppose), RAM quickly runs out and I get a memory error. What am I doing wrong? Is there a detailed guide anywhere on how to create masks for Mask-RCNN with the Tensorflow Object Detection API? I could not find one.

This is an old question, but it looks like you aren't converting your mask data to bytes before passing it to bytes_list_feature.
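A minimal sketch of that conversion, using a made-up 2-object mask shaped like what image_mask in the question returns (the array here is illustrative, not the poster's data): serialize the float array to a single bytes string with tobytes() before building the feature.

```python
import numpy as np

# hypothetical [N, H, W] mask, as image_mask in the question would return
mask = np.zeros([2, 4, 4], dtype=np.float32)
mask[0, :2, :2] = 1.0

# bytes_list_feature expects bytes, not raw float values
mask_bytes = mask.tobytes()

# round trip: the bytes decode back to the identical array
restored = np.frombuffer(mask_bytes, dtype=np.float32).reshape(2, 4, 4)
assert np.array_equal(restored, mask)
```

Note that this still stores the full dense mask (2 * 4 * 4 float32 values = 128 bytes here), so for large masks the per-object PNG encoding shown further down compresses far better.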

If memory is still a concern, the 'image/object/mask' feature can instead be a list of byte strings, one per object. With a very large N, the other option (an NxHxW array that must be manipulated after encoding) can run out of memory.

Here's how to compile instance-map data into object masks using the bytes-list option:

import io
import cv2
import numpy as np

# an HxW array of integer instance IDs, one per unique object
inst_map_data = func_to_get_inst_map_data(inst_map_path)

object_masks = []
class_labels = []
for inst_id in np.unique(inst_map_data):  # loop through all objects

    # an HxW array of 0s and 1s, 1s marking where this object is
    # (uint8 so cv2.imencode can handle it)
    obj_mask_data = np.where(inst_map_data == inst_id, 1, 0).astype(np.uint8)

    # encode as PNG to save space
    is_success, buffer = cv2.imencode(".png", obj_mask_data)
    io_buf = io.BytesIO(buffer)

    # a bytes string
    obj_mask_data_bytes = io_buf.getvalue()

    object_masks.append(obj_mask_data_bytes)
    # instance IDs here encode the class as class_id * 1000 + instance index
    class_labels.append(int(inst_id // 1000))

tf_example = tf.train.Example(features=tf.train.Features(feature={
...
      'image/object/class/label': dataset_util.int64_list_feature(class_labels),
      'image/object/mask': dataset_util.bytes_list_feature(object_masks),
  }))
