
How to efficiently loop over an image pixel by pixel in python OpenCV?

What I want to do is loop over an image pixel by pixel, using each pixel value to draw a circle in another, corresponding image. My approach is as follows:

it = np.nditer(pixels, flags=['multi_index'])
while not it.finished:
    y, x = it.multi_index
    color = int(it[0])  # plain int, so 8 * color cannot wrap around as uint8 on NumPy >= 2
    it.iternext()
    center = (x*20 + 10, y*20 + 10) # corresponding circle center
    cv2.circle(circles, center, int(8 * color/255), 255, -1)

Looping this way is somewhat slow. I tried adding numba's @njit decorator, but apparently it has problems with OpenCV.

Input images are 32x32 pixels. They map to output images of 32x32 circles, each circle drawn inside a 20x20 pixel square. That is, the output image is 640x640 pixels.
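The geometry described above can be checked with a few lines of NumPy. This is only a sketch and not part of the original post: the names `cell`, `centers` and `radii` are illustrative, and the radius formula is the one from the loop above (`int(8 * color / 255)`), rewritten with integer division.

```python
import numpy as np

in_size = 32   # the input image is 32x32 pixels
cell = 20      # each circle lives in a 20x20 pixel square (illustrative name)

out_size = in_size * cell                        # side of the output image
centers = np.arange(in_size) * cell + cell // 2  # circle centers along one axis

# Radius for every possible 8-bit pixel value, equivalent to int(8 * v / 255).
values = np.arange(256, dtype=np.int64)  # widened so 8 * v cannot overflow
radii = (8 * values) // 255

print(out_size)                  # 640
print(centers[:3], centers[-1])  # [10 30 50] 630
print(np.unique(radii))          # [0 1 2 3 4 5 6 7 8]
```

Note that the formula yields only nine distinct radii, 0 through 8, and radius 8 occurs only for pixel value 255.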

A single image takes around 100 ms to be transformed into circles, and I was hoping to lower that to 30 ms or less.

Any recommendations?

When:

  • Dealing with drawings
  • The number of possible options does not exceed a common-sense value (in this case: 256)
  • Speed is important (I guess that's always the case)
  • There's no other restriction preventing this approach

the best way is to "cache" the drawings: draw them upfront (or on demand, depending on the acceptable overhead) into another array. Then, whenever a drawing would normally take place, simply take the appropriate drawing from the cache and place it in the target area (as @ChristophRackwitz stated in one of the comments), which is a very fast NumPy operation compared to drawing.

As a side note, this is a generic method, not necessarily limited to drawings.
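Taking the caching idea one step further, the per-pixel loop can be dropped entirely by indexing the cache with the whole input image at once. The following is a minimal sketch under stated assumptions, not the code measured in this answer: `make_disc_tiles` and `draw_img_lut` are hypothetical names, and the discs are rasterized with a plain NumPy distance mask as a stand-in for cv2.circle (so the sketch needs only NumPy, and edge pixels may differ slightly from what OpenCV draws).

```python
import numpy as np


def make_disc_tiles(cell=20, color=255):
    """Return a (256, cell, cell) uint8 lookup table, one tile per pixel value.

    Assumption: discs come from a NumPy distance mask, not from cv2.circle.
    """
    yy, xx = np.mgrid[:cell, :cell]
    dist2 = (yy - cell // 2) ** 2 + (xx - cell // 2) ** 2
    radii = (8 * np.arange(256)) // 255  # same values as int(8 * v / 255)
    mask = dist2[None, :, :] <= (radii ** 2)[:, None, None]
    return np.where(mask, color, 0).astype(np.uint8)


def draw_img_lut(arr_in, tiles):
    """Expand every pixel of arr_in into its cached tile without a Python loop."""
    h, w = arr_in.shape
    th, tw = tiles.shape[1:]
    picked = tiles[arr_in]  # fancy indexing, shape (h, w, th, tw)
    # Interleave tile rows with image rows, then flatten into a 2D image.
    return picked.transpose(0, 2, 1, 3).reshape(h * th, w * tw)


tiles = make_disc_tiles()
img = np.random.randint(256, size=(32, 32), dtype=np.uint8)
out = draw_img_lut(img, tiles)
print(out.shape, out.dtype)  # (640, 640) uint8
```

Any array with the same (256, height, width) layout should be usable as `tiles`, for example a cache drawn upfront with cv2.circle if exact OpenCV rasterization matters.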

But the results you claim you're getting, ~100 ms per 32x32 image (to a 640x640 circles one), didn't make sense to me (OpenCV is also fast, and 1024 circles shouldn't be such a big deal), so I wrote a program to convince myself.

code00.py:

#!/usr/bin/env python

import itertools as its
import sys
import time

import cv2
import numpy as np


def draw_img_orig(arr_in, arr_out, *args):
    factor = round(arr_out.shape[0] / arr_in.shape[0])
    factor_2 = factor // 2
    it = np.nditer(arr_in, flags=["multi_index"])
    while not it.finished:
        y, x = it.multi_index
        color = int(it[0])  # plain int, so 8 * color cannot wrap around as uint8 on NumPy >= 2
        it.iternext()
        center = (x * factor + factor_2, y * factor + factor_2) # corresponding circle center
        cv2.circle(arr_out, center, int(8 * color / 255), 255, -1)


def draw_img_regular_iter(arr_in, arr_out, *args):
    factor = round(arr_out.shape[0] / arr_in.shape[0])
    factor_2 = factor // 2
    for row_idx, row in enumerate(arr_in):
        for col_idx, col in enumerate(row):
            # int(col): plain int, so 8 * col cannot wrap around as uint8 on NumPy >= 2
            cv2.circle(arr_out, (col_idx * factor + factor_2, row_idx * factor + factor_2), int(8 * int(col) / 255), 255, -1)


def draw_img_cache(arr_in, arr_out, *args):
    factor = round(arr_out.shape[0] / arr_in.shape[0])
    it = np.nditer(arr_in, flags=["multi_index"])
    while not it.finished:
        y, x = it.multi_index
        yf = y * factor
        xf = x * factor
        arr_out[yf: yf + factor, xf: xf + factor] = args[0][it[0]]
        it.iternext()


def generate_input_images(shape, count, dtype=np.uint8):
    return np.random.randint(256, size=(count,) + shape, dtype=dtype)


def generate_circles(shape, dtype=np.uint8, func=lambda x: int(8 * x / 255), color=255):
    ret = np.zeros((256,) + shape, dtype=dtype)
    cy = shape[0] // 2
    cx = shape[1] // 2
    for idx, arr in enumerate(ret):
        cv2.circle(arr, (cx, cy), func(idx), color, -1)
    return ret


def test_draw(imgs_in, img_out, count, draw_func, *draw_func_args):
    print("\nTesting {:s}".format(draw_func.__name__))
    start = time.time()
    for i, e in enumerate(its.cycle(range(imgs_in.shape[0]))):
        if i >= count:  # check before drawing so exactly `count` images are processed
            break
        draw_func(imgs_in[e], img_out, *draw_func_args)
    print("Took {:.3f} seconds ({:d} images)".format(time.time() - start, count))


def test_speed(shape_in, shape_out, dtype=np.uint8):
    imgs_in = generate_input_images(shape_in, 50, dtype=dtype)
    #print(imgs_in.shape, imgs_in)
    img_out = np.zeros(shape_out, dtype=dtype)
    circles = generate_circles((shape_out[0] // shape_in[0], shape_out[1] // shape_in[1]))
    count = 250
    test_draw(imgs_in, img_out, count, draw_img_orig)
    test_draw(imgs_in, img_out, count, draw_img_regular_iter)
    test_draw(imgs_in, img_out, count, draw_img_cache, circles)


def test_accuracy(shape_in, shape_out, dtype=np.uint8):
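    # np.product was removed in NumPy 2.0; np.prod is the supported name.
    # Cast after arange so values wrap to the dtype's range (0..255 repeated for uint8).
    img_in = np.arange(np.prod(shape_in)).astype(dtype).reshape(shape_in)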
    circles = generate_circles((shape_out[0] // shape_in[0], shape_out[1] // shape_in[1]))
    data = (
        (draw_img_orig, "orig.png", None),
        (draw_img_regular_iter, "regit.png", None),
        (draw_img_cache, "cache.png", circles),
    )
    imgs_out = [np.zeros(shape_out, dtype=dtype) for _ in range(len(data))]
    for idx, (draw_func, out_name, other_arg) in enumerate(data):
        draw_func(img_in, imgs_out[idx], other_arg)
        cv2.imwrite(out_name, imgs_out[idx])
    for idx, img in enumerate(imgs_out[1:], start=1):
        if not np.array_equal(img, imgs_out[0]):
            print("Image index different: {:d}".format(idx))


def main(*argv):
    dt = np.uint8
    shape_in = (32, 32)
    factor_io = 20
    shape_out = tuple(i * factor_io for i in shape_in)
    test_speed(shape_in, shape_out, dtype=dt)
    test_accuracy(shape_in, shape_out, dtype=dt)


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.")
    sys.exit(rc)

Notes:

  • Besides your implementation that uses np.nditer (which I placed in a function called draw_img_orig), I created 2 more:

    • One that iterates over the input array Pythonically (draw_img_regular_iter)

    • One that uses cached circles, and also iterates via np.nditer (draw_img_cache)

  • In terms of tests, there are 2 of them, each performed on all 3 approaches above:

    • Speed: measure the time taken to process a number of images

    • Accuracy: compare the outputs for a 32x32 input containing the interval [0, 255] (4 times)

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q071818080]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]> dir /b
code00.py

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.09_test0\Scripts\python.exe" code00.py
Python 3.9.9 (tags/v3.9.9:ccb0e6a, Nov 15 2021, 18:08:50) [MSC v.1929 64 bit (AMD64)] 064bit on win32

Testing draw_img_orig
Took 0.908 seconds (250 images)

Testing draw_img_regular_iter
Took 1.061 seconds (250 images)

Testing draw_img_cache
Took 0.426 seconds (250 images)

Done.

[prompt]>
[prompt]> dir /b
cache.png
code00.py
orig.png
regit.png

Above are the speed test results: as seen, your approach took a bit less than a second for 250 images (roughly 3.6 ms per image)! So I was right: I don't know where your slowness comes from, but it's not from here (maybe you got the measurements wrong?).
The regular iteration method is a bit slower, while the cached one is ~2X faster.
I ran the code on my laptop:

  • Win 10 pc064
  • CPU: Intel i7 6820HQ @ 2.70GHz (fairly old)
  • GPU: not relevant, as I didn't notice any spikes during execution
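Since numbers like these depend heavily on how they are measured, here is one hedged way to time a single function in isolation. It is a sketch, not the harness used above: `time_per_call` and `dummy_work` are hypothetical names, and the dummy workload should be replaced with the real drawing function and its arguments.

```python
import time


def time_per_call(func, *args, repeats=5, calls=50):
    """Return the best average wall-clock time per call, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(calls):
            func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / calls)
    return best * 1000.0


def dummy_work(n):
    # Stand-in workload (hypothetical); swap in the real drawing function.
    return sum(i * i for i in range(n))


ms = time_per_call(dummy_work, 10_000)
print("{:.3f} ms per call".format(ms))
```

Taking the minimum over several repeats reduces the influence of one-off costs such as imports, the first call, or other processes competing for the CPU, which are common reasons for a single measurement looking much slower than the steady state.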

Regarding the accuracy test, all 3 output arrays are identical (there's no message saying otherwise). Here's one saved image:

img0
