
How do I provide input to a detectron2 builtin model?

I trained a model, and now I would like to use it to detect objects in images. Using the DefaultPredictor, only the bounding boxes are returned, but I need the masks. I saw that you can also perform inference with this method:

model.eval()
with torch.no_grad():
    outputs = model(inputs)

I think that's what I should use. The problem is that I don't know how to build the inputs, starting from images.

import glob
import os

import cv2
import torch
from detectron2 import model_zoo
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.structures import ImageList

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/"
                                              "mask_rcnn_R_101_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.SOLVER.IMS_PER_BATCH = 1
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class
cfg.INPUT.FORMAT = "BGR"
# Just run these lines if you have the trained model in memory
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model

# build the model and load the trained weights
model = build_model(cfg)
DetectionCheckpointer(model).load("output/model_final.pth")
model.eval()  # make sure it's in eval mode

image = cv2.imread("/kaggle/working/detectron2/images/73-ab1.jpg")
height, width = image.shape[:2]
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
image = ImageList.from_tensors([image])

with torch.no_grad():
    inputs = image
    outputs = model(inputs)

Unfortunately, I think I'm doing something wrong. Can someone enlighten me?

See the Model Input Format for the builtin models.

Basically, the model in your code does not expect an ImageList object, but a list of dicts, where each dict provides specific information about one image, as explained in the documentation linked above.

So your inference code needs to be corrected to the following.

image = cv2.imread("/kaggle/working/detectron2/images/73-ab1.jpg")  # HWC, BGR
height, width = image.shape[:2]
# convert HWC to a CHW float32 tensor
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
# one dict per image; the model takes a list of such dicts
inputs = [{"image": image, "height": height, "width": width}]

with torch.no_grad():
    outputs = model(inputs)
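
If you have several images, the same list-of-dicts format can be built in a loop. The following is only a minimal sketch: `build_inputs` is a hypothetical helper name, not part of detectron2, and it assumes each image is an HWC BGR array such as the one returned by cv2.imread.

```python
import numpy as np
import torch


def build_inputs(images_bgr):
    """Turn HWC BGR arrays (e.g. from cv2.imread) into a list of input dicts.

    Hypothetical helper for illustration; the keys follow the format
    used in the corrected code above ("image", "height", "width").
    """
    inputs = []
    for img in images_bgr:
        height, width = img.shape[:2]
        # HWC -> CHW float32 tensor, as in the single-image code above
        tensor = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
        inputs.append({"image": tensor, "height": height, "width": width})
    return inputs
```

With the model built as in the question, `model(build_inputs([...]))` inside `torch.no_grad()` should then give back one result per input dict.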

You can also see this in the code, in the forward method of the GeneralizedRCNN class.
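
Since what you actually want are the masks: for an instance segmentation model, each element of `outputs` should be a dict with an "instances" entry that carries fields such as `scores` and `pred_masks` (see the model output format in the same documentation). A rough sketch of pulling the masks out as NumPy arrays could look like this; `masks_from_output` and `score_threshold` are names made up for this example, and the sketch assumes `pred_masks` has shape (N, H, W).

```python
import numpy as np
import torch


def masks_from_output(output, score_threshold=0.0):
    """Return (masks, scores) as NumPy arrays for one element of `outputs`.

    Hypothetical helper; assumes output["instances"] exposes `scores`
    (shape N) and `pred_masks` (shape N x H x W).
    """
    instances = output["instances"]
    scores = instances.scores.cpu().numpy()
    # cast to bool so the result is a binary mask per instance
    masks = instances.pred_masks.to(torch.bool).cpu().numpy()
    keep = scores >= score_threshold  # optional extra filtering by score
    return masks[keep], scores[keep]
```

For the single-image code above, this would be called as `masks, scores = masks_from_output(outputs[0])`.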
