
Extract segmentation masks from Mask RCNN

I'm training a model to recognize hands and want to extract the segmentation masks after detection using the matterport MRCNN (https://github.com/matterport/Mask_RCNN):

import os

import cv2
import mrcnn.model
import mrcnn.visualize

# SimpleConfig and CLASS_NAMES are not shown here.
model = mrcnn.model.MaskRCNN(mode="inference",
                             config=SimpleConfig(),
                             model_dir=os.getcwd())

model.load_weights(filepath="mask_rcnn_0028.h5",
                   by_name=True)

# OpenCV loads images as BGR; convert to RGB before detection.
image = cv2.imread("CARDS_COURTYARD.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

results = model.detect([image], verbose=0)
r = results[0]

mrcnn.visualize.display_instances(image=image,
                                  boxes=r['rois'],
                                  masks=r['masks'],
                                  class_ids=r['class_ids'],
                                  class_names=CLASS_NAMES,
                                  scores=r['scores'])

Here is an example detection:

[Image: MaskRCNN hands detection output]

After detection, I reshape the boolean masks array (returned in the results as r['masks']) so I can access each segmentation mask individually (masks[0] being the mask of the first class id, in this case 'yourright'), and save each array as an image:

from PIL import Image

masks = r['masks']

masks = masks.reshape(2, 720, 1280)

im = Image.fromarray(masks[0])
im.save("mask.jpeg")

My output from this is:

[Image: 'yourright' segmentation mask]

Whilst this has the shape of the segmentation mask, and the dimensions are the same as the original image, the output image is not the segmentation as it appears in the original image. I am looking for the extracted masks to be output as they are overlaid on the original image, and not 'zoomed-in' as they currently are. I assumed that because the masks array has the same dimensions as the original image, the masks would retain their position, but apparently not. How can I output the segmentation masks as they appear in the original image?

Cheers

Figured out the solution myself. Posting it here in case anyone else runs into the same issue.

The problem is that I misunderstood how reshaping an array works. Reshaping so that the third dimension becomes the first isn't a superficial change: it re-reads the same data in a different layout, so any image built from the result looks entirely different, although I'm still unsure how the mask retained its general shape regardless. Reshaping the data, as I had done, is entirely unnecessary, because you can index any dimension irrespective of its position. I previously thought that to access the third dimension, the array had to be reshaped so that it came first:

masks = masks.reshape(2, 720, 1280)
im = Image.fromarray(masks[0])

Changing the shape in this way reorganises the data and distorts the image. You can simply specify which dimension to index with:

im = Image.fromarray(masks[:,:,0])

In this case, I'm accessing the first (index 0) layer along the third dimension of the array.
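
As a minimal sketch of the difference, assuming r['masks'] is a C-ordered boolean array of shape (height, width, instances): the tiny array below is a hypothetical stand-in for it, not real detector output. It compares slicing the last axis, moving that axis to the front with np.moveaxis, and the reshape used in the question.

```python
import numpy as np

# Hypothetical stand-in for r['masks']: height 4, width 6, 2 instances,
# laid out as (H, W, N) like the array in the question.
H, W, N = 4, 6, 2
masks = np.zeros((H, W, N), dtype=bool)
masks[0:2, 0:3, 0] = True   # instance 0: top-left block
masks[2:4, 3:6, 1] = True   # instance 1: bottom-right block

# Correct: slice the last axis; the result is one (H, W) mask.
first = masks[:, :, 0]

# Also correct: move the instance axis to the front, then index it.
instances_first = np.moveaxis(masks, -1, 0)   # shape (N, H, W)

# Incorrect: reshape keeps the flat element order and only changes how
# that order is cut into rows, so pixels of both instances get mixed.
reshaped = masks.reshape(N, H, W)

same_as_slice = np.array_equal(instances_first[0], first)
same_as_reshape = np.array_equal(reshaped[0], first)

# reshaped[0] is the first half of the flat buffer: the top half of the
# rows, with both instances interleaved along each row.
top_half_only = np.array_equal(reshaped[0], masks[: H // 2].reshape(H, W))

print(instances_first.shape, same_as_slice, same_as_reshape, top_half_only)
# (2, 4, 6) True False True
```

Under these assumptions, reshaped[0] draws only on the top half of the rows and spreads them over the full height, which may be why the saved image looked like a stretched, 'zoomed-in' version of the mask rather than random noise.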

Converting this to an image produces the mask as seen in the detection image:

[yourright detection][1]

[1]: https://i.stack.imgur.com/ewMY3.jpg
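
To save every detected mask rather than just the first, one possible approach is to loop over the last axis. This is a sketch only: save_instance_masks, the mask_{i}.png file names, and the demo array are hypothetical, and the array is assumed to have the same (height, width, instances) layout as r['masks'].

```python
import os
import tempfile

import numpy as np
from PIL import Image


def save_instance_masks(masks, out_dir):
    """Save each (H, W) slice of an (H, W, N) boolean array as a PNG."""
    paths = []
    for i in range(masks.shape[-1]):
        # Convert True/False to 255/0 so the mask is a plain 8-bit image.
        mask_u8 = masks[:, :, i].astype(np.uint8) * 255
        path = os.path.join(out_dir, f"mask_{i}.png")
        Image.fromarray(mask_u8).save(path)
        paths.append(path)
    return paths


# Hypothetical stand-in for r['masks'] with two rectangular "detections".
demo = np.zeros((720, 1280, 2), dtype=bool)
demo[100:300, 200:500, 0] = True
demo[400:600, 700:1000, 1] = True

out_dir = tempfile.mkdtemp()
saved = save_instance_masks(demo, out_dir)

# Reload the first file to check the mask kept its size and position.
reloaded = np.array(Image.open(saved[0]))
```

PNG is used here instead of JPEG because it is lossless, so the saved mask stays strictly black and white; JPEG compression can introduce grey pixels along the mask edges.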

An easy mistake to make, especially if, like me, you are extremely new to Python!
