
How to traverse through image drawing bounding boxes?

I want to traverse an image, draw bounding boxes on it, and do some calculations with each submatrix of the image. I am trying to make the following C++ code work in Python (taken from the answer here).

for (int y = 0; y <= resizedImage.rows - 32; y += 32) {
    for (int x = 0; x <= resizedImage.cols - 32; x += 32) {
        // get the average for the whole 32x32 block
        Rect roi(x, y, 32, 32);
        Scalar mean, dev;
        meanStdDev(resizedImage(roi), mean, dev); // mean[0] is the mean of the first channel, gray scale value;
    }
}

I want to calculate the mean and print the ROI (region of interest). This is my code in Python using Pillow. The image I used for my code is here.

from PIL import Image, ImageDraw
import numpy as np

image = Image.open(path)
draw = ImageDraw.Draw(image)
step = 64
original_rows, original_cols = image.size
rows = original_rows + step
cols = original_cols + step
image_arr = np.asarray(image)

for row in range(0, rows, step):
    if row <= rows - step:
        for col in range(0, cols, step):
            if col <= cols - step:
                box = (col,row,step,step)
                region = image.crop(box)
                print(np.asarray(region))
                draw.rectangle([col,row,step,step], width = 1, outline="#FFFFFF")
image.show()

Since the image is 256 x 256 and my step is 64, I expect 16 regions to be printed, but only the first one is printed and the rest seem to be empty (look at the size of the Pillow objects). I also do not understand why it prints 24 <PIL.Image.Image> objects when I am expecting 16. Here is my output:

[[[255   0   0 255]
  [255   0   0 255]
  [255   0   0 255]
  ...
  [255   0   0 255]
  [255   0   0 255]
  [255   0   0 255]]]]

<PIL.Image.Image image mode=RGBA size=0x64 at 0x11937F5F8>
<PIL.Image.Image image mode=RGBA size=0x64 at 0x10E9A4748>
<PIL.Image.Image image mode=RGBA size=0x64 at 0x11937F3C8>
<PIL.Image.Image image mode=RGBA size=0x64 at 0x1193618D0>
<PIL.Image.Image image mode=RGBA size=64x0 at 0x11937F5F8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x10E9A4748>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F3C8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x1193618D0>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F5F8>
<PIL.Image.Image image mode=RGBA size=64x0 at 0x10E9A4748>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F3C8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x1193618D0>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F5F8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x10E9A4748>
<PIL.Image.Image image mode=RGBA size=64x0 at 0x11937F3C8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x1193618D0>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F5F8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x10E9A4748>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F3C8>
<PIL.Image.Image image mode=RGBA size=64x0 at 0x1193618D0>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F5F8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x10E9A4748>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x11937F3C8>
<PIL.Image.Image image mode=RGBA size=0x0 at 0x1193618D0>

Following the answer here, I understood that I needed to turn the image into a NumPy array straight after opening it; however, that did not help.

What am I doing wrong? I would appreciate any help.

EDIT: I made it work using the NumPy array. I still do not understand why and how using Pillow's crop didn't work.

image = Image.open(path)
step = 64
cols, rows = image.size  # Pillow's size is (width, height)
image_arr = np.asarray(image) #Added this

for row in range(0, rows, step):
    for col in range(0, cols, step):
        roi = image_arr[row:row+step, col:col+step] #Added this instead of using Pillow
        print(np.mean(roi))
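On the open question of why the Pillow version printed mostly empty regions: Pillow's `crop` takes a box of `(left, upper, right, lower)` edges, not `(x, y, width, height)`, so `(col, row, step, step)` only describes a non-empty area for the first tile. The sketch below uses plain arithmetic rather than Pillow itself to reproduce the sizes seen in the output. The helper `crop_size` is hypothetical, and clamping inverted boxes to zero is an assumption made to mirror the printed sizes; other Pillow versions may handle inverted boxes differently.

```python
def crop_size(box):
    """Size of the area described by a (left, upper, right, lower) box.

    Inverted boxes are clamped to zero, mirroring the sizes printed in the question.
    """
    left, upper, right, lower = box
    return max(0, right - left), max(0, lower - upper)


width = height = 256  # image size from the question
step = 64

# The question adds one step to each dimension, so each range has 5 stops.
stops = list(range(0, width + step, step))  # [0, 64, 128, 192, 256]

# Box as written in the question: (col, row, step, step).
wrong = [crop_size((col, row, step, step)) for row in stops for col in stops]

# Box built from right/lower edges, iterating over the unpadded size.
fixed = [crop_size((col, row, col + step, row + step))
         for row in range(0, height, step)
         for col in range(0, width, step)]

print(len(wrong), wrong[:6])
print(len(fixed), set(fixed))
```

The 5 x 5 loop gives 25 iterations: one real 64x64 tile followed by 24 empty ones, which would account for the 24 `<PIL.Image.Image>` lines. With edge-based boxes and the unpadded ranges, all 16 tiles come out as 64x64.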

I'm wondering why you're using PIL at all, especially since your source code is OpenCV-based and you need to handle NumPy arrays anyway.

That'd be my solution:

import cv2
import numpy as np

# Read input image; create additional output image to draw on
image = cv2.imread('ZsyOG.png')
image_out = image.copy()

# Parameters
step = 64
rows, cols = image.shape[:2]  # shape is (height, width, channels)

# Actual processing in loop
i_region = 0
for row in range(0, rows, step):
    for col in range(0, cols, step):
        mean = cv2.mean(image[row:row+step, col:col+step])
        image_out = cv2.rectangle(img=image_out,
                                  pt1=(col, row),  # points are (x, y)
                                  pt2=(col + step, row + step),
                                  color=(255, 255, 255),
                                  thickness=1)
        image_out = cv2.putText(img=image_out,
                                text=str(i_region),
                                org=(int(col+1/2*step), int(row+1/2*step)),
                                fontFace=cv2.FONT_HERSHEY_COMPLEX_SMALL,
                                fontScale=1.0,
                                color=(255, 255, 255))
        print('Region: ', i_region, '| Mean: ', mean)
        i_region += 1

cv2.imshow('image_out', image_out)
cv2.waitKey(0)
cv2.destroyAllWindows()

Output image:

[Output image]

Print output:

Region:  0 | Mean:  (0.0, 0.0, 255.0, 0.0)
Region:  1 | Mean:  (0.0, 0.0, 255.0, 0.0)
Region:  2 | Mean:  (0.0, 255.0, 255.0, 0.0)
Region:  3 | Mean:  (0.0, 255.0, 255.0, 0.0)
Region:  4 | Mean:  (0.0, 0.0, 255.0, 0.0)
Region:  5 | Mean:  (0.0, 0.0, 255.0, 0.0)
Region:  6 | Mean:  (0.0, 255.0, 255.0, 0.0)
Region:  7 | Mean:  (0.0, 255.0, 255.0, 0.0)
Region:  8 | Mean:  (0.0, 0.0, 0.0, 0.0)
Region:  9 | Mean:  (0.0, 0.0, 0.0, 0.0)
Region:  10 | Mean:  (255.0, 0.0, 0.0, 0.0)
Region:  11 | Mean:  (255.0, 0.0, 0.0, 0.0)
Region:  12 | Mean:  (0.0, 0.0, 0.0, 0.0)
Region:  13 | Mean:  (0.0, 0.0, 0.0, 0.0)
Region:  14 | Mean:  (255.0, 0.0, 0.0, 0.0)
Region:  15 | Mean:  (255.0, 0.0, 0.0, 0.0)
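If staying with Pillow is preferred, the same traversal can probably be written without OpenCV. The following is a minimal sketch, not a drop-in replacement: the function name `block_means` and the synthetic red test image are made up for illustration, and it assumes that both `Image.crop` and `ImageDraw.rectangle` take corner coordinates `(left, upper, right, lower)` and that `image.size` is `(width, height)`.

```python
import numpy as np
from PIL import Image, ImageDraw


def block_means(image, step):
    """Draw step x step boxes on a copy of `image` and return the per-block means."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    width, height = image.size  # Pillow reports (width, height)
    means = []
    for top in range(0, height, step):
        for left in range(0, width, step):
            # Clip so edge blocks stay inside the image when size % step != 0.
            right = min(left + step, width)
            lower = min(top + step, height)
            region = image.crop((left, top, right, lower))
            means.append(float(np.asarray(region).mean()))
            # rectangle() takes two corners; subtract 1 to stay inside the block.
            draw.rectangle([left, top, right - 1, lower - 1],
                           outline="#FFFFFF", width=1)
    return annotated, means


# Synthetic stand-in for the question's image (hypothetical data).
demo = Image.new("RGBA", (256, 256), (255, 0, 0, 255))
annotated, means = block_means(demo, step=64)
print(len(means), means[0])
```

Note that `np.asarray(region).mean()` averages over all channels at once, as `np.mean(roi)` does in the question's edit; pass `axis=(0, 1)` to get one value per channel instead.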

Hope that helps!

----------------------------------------
System information
----------------------------------------
Platform:    Windows-10-10.0.16299-SP0
Python:      3.8.1
NumPy:       1.18.1
OpenCV:      4.2.0
----------------------------------------
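As a loop-free variation on the same idea, the per-block mean and standard deviation (what `meanStdDev` reports in the C++ original) can be computed with a NumPy reshape. This is only a sketch under the assumption that both image dimensions are exact multiples of the step; the function name `block_stats` and the two-colour test array are hypothetical. `np.std` uses the population formula by default.

```python
import numpy as np


def block_stats(image_arr, step):
    """Per-block, per-channel mean and standard deviation of an (H, W, C) array.

    Assumes H and W are exact multiples of `step`.
    """
    height, width, channels = image_arr.shape
    if height % step or width % step:
        raise ValueError("image size must be a multiple of step")
    # Split rows and columns into (block index, offset within block).
    blocks = image_arr.reshape(height // step, step, width // step, step, channels)
    # Reduce over the two within-block axes.
    mean = blocks.mean(axis=(1, 3))
    std = blocks.std(axis=(1, 3))
    return mean, std


# Hypothetical test image: left half red, right half blue (RGB order).
demo = np.zeros((256, 256, 3), dtype=np.uint8)
demo[:, :128] = (255, 0, 0)
demo[:, 128:] = (0, 0, 255)

mean, std = block_stats(demo, step=64)
print(mean.shape)
print(mean[0, 0], mean[0, 3])
```

Here `mean[i, j]` holds the channel means of the block in block-row `i` and block-column `j`, so the result can be indexed the same way as the grid drawn on the image.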

