How do I feed an OpenCV video stream into my PyTorch neural network?

Question

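I wrote YOLOv3 from scratch in PyTorch. If I pass an image through the model using trained weights, it sort of works. The next step is to use my camera so YOLO can do its magic in real time.

I think the right pipeline is to capture a single frame of the video and feed it to the network, then draw the boxes on that same frame.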

checkpoint = torch.load("\my_checkpoint_40.pth.tar")
model = YOLOv3(in_channels = 3, num_classes = 20).to(config.DEVICE)
model.load_state_dict(checkpoint["state_dict"])
ip_camera = "http://192.168.1.70:4500/mjpegfeed?640x480"
outputFile = "yolo_out_py.avi"

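With this, I load the weights into the network. Then I wrote a function that uses my camera (it is DroidCam on my phone; my PC has no camera device, so I use the phone's IP address). That part of the code works: the video appears on screen. outputFile is meant to be the destination path for the written video. The problem starts when I try to feed a single frame to the network and run the rest of the process.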

def streaming(model, thresh, iou_thresh, anchors, ip_camera):
    stream = cv2.VideoCapture(ip_camera)

    # Corrective actions printed in the event of a failed connection.
    if stream.isOpened() is not True:
        print('Not opened.')
        print('Please ensure the following:')
        print('1. DroidCam is not running in your browser.')
        print('2. The IP address given is correct.')
    # Resize the image to the input dimensions of the YOLOv3 network
    width = 416
    height = 416
    # Connection successful. Proceeding to display video stream.
    while stream.isOpened() is True:
        # Capture frame-by-frame
        ret, f = stream.read()
        dim = (width, height)
        image = cv2.resize(f, dim, interpolation = cv2.INTER_AREA)
        cv2.imshow('frame', image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        for frame in image:
            model.eval()
            anchors = torch.tensor(anchors)
            anchors = anchors.to(config.DEVICE) 
            x = torch.tensor(frame)
            x = x.to("cuda")

            # from this line to the nms_boxes, it's the same code I used for plotting a single image
            with torch.no_grad():
                out = model(x)
                bboxes = [[] for _ in range(x.shape[0])]
                for i in range(1):
                    batch_size, A, S, _, _ = out[i].shape
                    anchor = anchors[i]
                    boxes_scale_i = cells_to_bboxes(
                        out[i], anchor, S = S, is_preds = True
                    )
                    for idx, (box) in enumerate(boxes_scale_i):
                        bboxes[idx] += box

                model.train()

            for i in range(batch_size):
                nms_boxes = non_max_suppression(
                    bboxes[i], iou_threshold = iou_thresh, threshold = thresh, box_format = "midpoint",
                )
           # cells_to_boxes and non_max_suppression are functions that return boxes coordinates
           # and the "better" box

                #now it's time to write things on the frame
                frame = cv2.VideoWriter(outputFile, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), 30,
                                        (round(stream.get(cv2.CAP_PROP_FRAME_WIDTH)), round(stream.get(cv2.CAP_PROP_FRAME_HEIGHT))))

                frame.write(nms_boxes)
    stream.release()
    cv2.destroyAllWindows()

The code fails for several reasons. model.eval() gives me an error: missing 1 required positional argument: 'self'

I also get several other errors, which I think come from not having the right workflow for streaming video. This is my first time working with OpenCV.

If I remove model.eval(), I get another error at:

out = model(x)

Here is the traceback:

Traceback (most recent call last):
  File "C:/Python_Project/YOLOV3/openVid.py", line 97, in <module>
    streaming(YOLOv3, 0.6, 0.6, config.ANCHORS, ip_camera)
  File "C:/Python_Project/YOLOV3/openVid.py", line 72, in streaming
    out = model(x)
  File "C:\Python_Project\YOLOV3\model.py", line 106, in __init__
    self.layers = self.create_conv_layers()
  File "C:\Python_Project\YOLOV3\model.py", line 141, in create_conv_layers
    CNNBlock(
  File "C:\Python_Project\YOLOV3\model.py", line 46, in __init__
    self.conv = nn.Conv2d(in_channels, out_channels, bias = not bn_act, **kwargs)
  File "C:\Users\Simone\anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\conv.py", line 430, in __init__
    super(Conv2d, self).__init__(
  File "C:\Users\Simone\anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\conv.py", line 83, in __init__
    if in_channels % groups != 0:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

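I don't know what to do next. I read that I should convert the model to ONNX, but I really don't know how to do that. I couldn't find any tutorial on the internet and I'm stuck. Can you help me?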

Answer 1

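According to the error message, model is not a class instance. Note that in the traceback,

out = model(x)

is calling the __init__ function. So model is probably YOLOv3 rather than YOLOv3(...). Based on the __init__ signature, x is treated as in_channels, and since x is an image,

RuntimeError: Boolean value of Tensor with more than one value is ambiguous

makes sense. This also explains the .eval() error. Beyond that, I believe you need to add a batch dimension to your frame (e.g., x.unsqueeze(0)), otherwise you will get another error.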
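
To make both points concrete, here is a minimal sketch. The traceback in the question shows the call streaming(YOLOv3, 0.6, 0.6, config.ANCHORS, ip_camera), which passes the class itself; the fix is to pass the instance built with YOLOv3(in_channels = 3, num_classes = 20). Since I don't have your model code, TinyDetector below is a hypothetical stand-in for YOLOv3, frame_to_batch is a helper name I made up, and the synthetic NumPy array stands in for the resized frame from stream.read(). The BGR-to-RGB flip and the division by 255 are assumptions; use whatever preprocessing you trained with.

```python
import numpy as np
import torch
import torch.nn as nn


class TinyDetector(nn.Module):
    """Hypothetical stand-in for YOLOv3 so the sketch runs on its own."""

    def __init__(self, in_channels=3, num_classes=20):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, num_classes, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)


def frame_to_batch(frame_bgr, device="cpu"):
    """Turn one OpenCV-style frame (H, W, 3, uint8) into a (1, 3, H, W) float tensor."""
    # Assumption: the model was trained on RGB images; OpenCV delivers BGR.
    rgb = np.ascontiguousarray(frame_bgr[:, :, ::-1])
    x = torch.from_numpy(rgb).permute(2, 0, 1)  # HWC -> CHW
    x = x.float() / 255.0                       # assumed scaling to [0, 1]
    return x.unsqueeze(0).to(device)            # add the batch dimension


# Calling eval() on the class (not an instance) reproduces the first error.
try:
    TinyDetector.eval()
except TypeError as err:
    class_error = str(err)

# Build an instance once and switch to eval mode before the capture loop.
model = TinyDetector(in_channels=3, num_classes=20)
model.eval()

# Synthetic 416x416 frame standing in for the resized camera frame.
frame = np.zeros((416, 416, 3), dtype=np.uint8)
batch = frame_to_batch(frame)

with torch.no_grad():
    out = model(batch)

print(class_error)
print(tuple(batch.shape), tuple(out.shape))
```

Note that the whole frame goes through the model in one call. The loop for frame in image in the question iterates over the rows of a single image, so each x there would be one row of pixels rather than a frame.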
