Remove colors from an image if they are not in a list (Python)
I have a relatively large RGBA image (converted to a NumPy array) in which I need to replace every color that does not appear in a list. How could I do this in a fast, Pythonic way?
I have a solution that uses simple iteration, but because the images are quite large (2500 x 2500) it is very slow.
# Keep only these colors in the image, otherwise replace with (0,255,0,255)
palette = [[0,0,0,255],[0, 255, 0,255], [255, 0, 0,255], [128, 128, 128,255], [0, 0, 255,255], [255, 0, 255,255], [0, 255, 255,255], [255, 255, 255,255], [128, 128, 0,255], [0, 128, 128,255], [128, 0, 128,255]]
# Current slow solution with a 2500 x 2500 x 4 array (mask)
for z in range(mask.shape[0]):
    for y in range(mask.shape[1]):
        if (mask[z,y,:].tolist() not in palette):
            mask[z, y] = (0,255,0,255)
Expected operating time per image: less than half a minute
Current time: two minutes
Those are definitely not the kind of running times you should be looking at. Here's an approach with broadcasting:
import numpy as np

# palette.shape == (4,11)
palette = np.array(palette).transpose()
# sample a.shape == (2,2,4)
a = np.array([[[ 28, 231, 203, 235],
               [255,   0,   0, 255]],
              [[ 50, 152,  36, 151],
               [252,  43,  63,  25]]])
# mask
# all(2) force all channels to be equal
# any(-1) matches any color
mask = (a[:,:,:, None] == palette).all(2).any(-1)
# replace color
rep_color = np.array([0,255,0,255])
# np.where to the rescue:
ret = np.where(mask[:,:,None], a, rep_color[None,None,:])
In the sample, only the pixel [255, 0, 0, 255] is in the palette, so the other three pixels become [0, 255, 0, 255].
For a = np.random.randint(0,256, (2500,2500,4)), it takes:
5.26 s ± 179 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
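To make the two reductions easier to follow, here is the same 2 x 2 sample again with each intermediate shape spelled out. The names eq and per_color are only illustrative; the logic is meant to be the broadcasting approach above, nothing new.

```python
import numpy as np

palette = [[0, 0, 0, 255], [0, 255, 0, 255], [255, 0, 0, 255], [128, 128, 128, 255],
           [0, 0, 255, 255], [255, 0, 255, 255], [0, 255, 255, 255], [255, 255, 255, 255],
           [128, 128, 0, 255], [0, 128, 128, 255], [128, 0, 128, 255]]
p = np.array(palette).transpose()            # (4, 11): channels x colors

a = np.array([[[ 28, 231, 203, 235],
               [255,   0,   0, 255]],
              [[ 50, 152,  36, 151],
               [252,  43,  63,  25]]])       # (2, 2, 4)

eq = a[:, :, :, None] == p                   # (2, 2, 4, 11): per-channel matches
per_color = eq.all(2)                        # (2, 2, 11): all four channels match a color
mask = per_color.any(-1)                     # (2, 2): pixel matches at least one color

rep_color = np.array([0, 255, 0, 255])
ret = np.where(mask[:, :, None], a, rep_color[None, None, :])

print(eq.shape, per_color.shape, mask.shape)
print(mask.tolist())                         # [[False, True], [False, False]]
```

Note that eq holds one boolean per pixel, channel and palette color, so for a 2500 x 2500 image it is a 2500 x 2500 x 4 x 11 array (275 million bytes).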
Update: if you force everything to be np.uint8, you can merge the four channels into a single 32-bit integer and get even faster speed:
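a = np.random.randint(0,256, (2500,2500,4), dtype=np.uint8)
# palette here is the original list of colors, not the transposed array from above
p = np.array(palette, dtype=np.uint8).transpose()
# zip the data into 32 bits
# widen to uint32 first: multiplying uint8 data by 2**24 overflows
# (and raises an error on recent NumPy versions)
# could be even faster if we handle the memory directly
a32 = a.astype(np.uint32)
p32 = p.astype(np.uint32)
aa = a32[:,:,0]*(2**24) + a32[:,:,1]*(2**16) + a32[:,:,2]*(2**8) + a32[:,:,3]
pp = p32[0]*(2**24) + p32[1]*(2**16) + p32[2]*(2**8) + p32[3]
mask = (aa[:,:,None]==pp).any(-1)
ret = np.where(mask[:,:,None], a, rep_color[None,None,:])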
which takes:
1.34 s ± 29.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
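The comment about handling the memory directly could look something like the sketch below: instead of packing the channels arithmetically, each 4-byte pixel is reinterpreted as one uint32 with view, and np.isin does the membership test. The helper name keep_palette_colors and its parameters are made up for illustration, and I have not timed it against the numbers above.

```python
import numpy as np

palette = [[0, 0, 0, 255], [0, 255, 0, 255], [255, 0, 0, 255], [128, 128, 128, 255],
           [0, 0, 255, 255], [255, 0, 255, 255], [0, 255, 255, 255], [255, 255, 255, 255],
           [128, 128, 0, 255], [0, 128, 128, 255], [128, 0, 128, 255]]

def keep_palette_colors(img, palette, fill):
    """Hypothetical helper: replace every RGBA pixel that is not in palette by fill."""
    img = np.ascontiguousarray(img, dtype=np.uint8)        # (H, W, 4)
    pal = np.ascontiguousarray(palette, dtype=np.uint8)    # (N, 4)
    # Reinterpret the 4 bytes of each pixel as a single uint32 (no arithmetic).
    # Both sides are packed the same way, so byte order does not matter.
    img32 = img.view(np.uint32)[..., 0]                    # (H, W)
    pal32 = pal.view(np.uint32)[..., 0]                    # (N,)
    mask = np.isin(img32, pal32)                           # (H, W) boolean
    return np.where(mask[..., None], img, np.asarray(fill, dtype=np.uint8))

# Small usage example on random data with one known palette pixel.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64, 4), dtype=np.uint8)
img[0, 0] = [255, 0, 0, 255]
out = keep_palette_colors(img, palette, [0, 255, 0, 255])
print(out.dtype, out.shape, out[0, 0].tolist())
```

Because the fill color is also uint8, the result stays uint8 and can be handed straight back to an image library.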
I had a go with pyvips. This is a threaded, streaming image processing library, so it's fast and doesn't need much memory.
import sys
import pyvips
from functools import reduce
# Keep only these colors in the image, otherwise replace with (0,255,0,255)
palette = [[0,0,0,255], [0, 255, 0,255], [255, 0, 0,255], [128, 128, 128,255], [0, 0, 255,255], [255, 0, 255,255], [0, 255, 255,255], [255, 255, 255,255], [128, 128, 0,255], [0, 128, 128,255], [128, 0, 128,255]]
im = pyvips.Image.new_from_file(sys.argv[1], access="sequential")
# test our image against each sample ... bandand() will AND all image bands
# together, ie. we want pixels where they all match
masks = [(im == colour).bandand() for colour in palette]
# OR all the masks together to find pixels which are in the palette
mask = reduce((lambda x, y: x | y), masks)
# pixels not in the mask become [0, 255, 0, 255]
im = mask.ifthenelse(im, [0, 255, 0, 255])
im.write_to_file(sys.argv[2])
With a 2500 x 2500 pixel PNG on this 2015 i5 laptop I see:
$ /usr/bin/time -f %M:%e ./replace-pyvips.py ~/pics/x.png y.png
55184:0.92
So a max of 55 MB of memory, and 0.92 s of elapsed time.
I tried Quang Hoang's excellent numpy version for comparison:
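import sys
import numpy as np
from PIL import Image

# load the input as an (H, W, 4) array; this assumes an RGBA image read with Pillow
a = np.asarray(Image.open(sys.argv[1]))
# palette is the same list as in the pyvips version above
p = np.array(palette).transpose()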
# mask
# all(2) force all channels to be equal
# any(-1) matches any color
mask = (a[:,:,:, None] == p).all(2).any(-1)
# replace color
rep_color = np.array([0,255,0,255])
# np.where to the rescue:
a = np.where(mask[:,:,None], a, rep_color[None,None,:])
im = Image.fromarray(a.astype('uint8'))
im.save(sys.argv[2])
Run on the same 2500 x 2500 pixel image:
$ /usr/bin/time -f %M:%e ./replace-broadcast.py ~/pics/x.png y.png
413504:3.08
A peak of 410 MB of memory, and 3.1 s elapsed.
Both versions could be sped up further by comparing uint32 values, as Hoang says.
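A large part of the NumPy version's memory use likely comes from the 2500 x 2500 x 4 x 11 boolean array that the broadcast comparison creates (275 million bytes). One way around that, sketched below, is to borrow the structure of the pyvips version: test one palette color at a time and OR the masks together. The helper name palette_mask is hypothetical, and I have not measured this variant.

```python
import numpy as np

palette = [[0, 0, 0, 255], [0, 255, 0, 255], [255, 0, 0, 255], [128, 128, 128, 255],
           [0, 0, 255, 255], [255, 0, 255, 255], [0, 255, 255, 255], [255, 255, 255, 255],
           [128, 128, 0, 255], [0, 128, 128, 255], [128, 0, 128, 255]]

def palette_mask(img, palette):
    """Hypothetical helper: True where an (H, W, 4) pixel equals some palette color."""
    mask = np.zeros(img.shape[:2], dtype=bool)
    for color in np.asarray(palette, dtype=img.dtype):
        # One (H, W, 4) boolean comparison at a time, reduced to (H, W).
        mask |= (img == color).all(axis=-1)
    return mask

# Small usage example on random data with one known palette pixel.
rng = np.random.default_rng(1)
img = rng.integers(0, 256, (32, 32, 4), dtype=np.uint8)
img[3, 5] = [128, 0, 128, 255]
mask = palette_mask(img, palette)
fill = np.array([0, 255, 0, 255], dtype=np.uint8)
out = np.where(mask[:, :, None], img, fill)
print(bool(mask[3, 5]), out.dtype)
```

With 11 colors this is 11 passes over the image, but each pass only allocates a temporary the size of one comparison.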
Using this code I was able to process a randomly generated 2500 x 2500 image in 33 to 37 seconds. The method you had took between 51 and 57 seconds on my machine.
import numpy as np

mask = np.random.rand(2500,2500,4)
mask = np.floor(mask * 255)
palette = np.array([[0,0,0,255],[0, 255, 0,255], [255, 0, 0,255], [128, 128, 128,255], [0, 0, 255,255], [255, 0, 255,255], [0, 255, 255,255], [255, 255, 255,255], [128, 128, 0,255], [0, 128, 128,255], [128, 0, 128,255]])
default = np.array([0,255,0,255])
for z in range(mask.shape[0]):
    for y in range(mask.shape[1]):
        # compare the whole pixel against each palette row; `pixel in palette`
        # on NumPy arrays would be True as soon as any single channel matched
        if not (palette == mask[z,y,:]).all(axis=1).any():
            mask[z,y,:] = default