Convert RGB image to index image
I want to convert a 3-channel RGB image to an index image with Python. It is used for handling the labels when training a deep network for semantic segmentation. By index image I mean an image with one channel in which each pixel holds an index, starting from zero. The index image should have the same height and width as the input. The conversion is based on the following mapping, given as a Python dict:
color2index = {
(255, 255, 255) : 0,
(0, 0, 255) : 1,
(0, 255, 255) : 2,
(0, 255, 0) : 3,
(255, 255, 0) : 4,
(255, 0, 0) : 5
}
I've implemented a naive function:
import numpy as np

def im2index(im):
    """
    Turn a 3-channel image (as returned by cv2.imread) into a 1-channel index image.
    """
    assert len(im.shape) == 3
    height, width, ch = im.shape
    assert ch == 3
    m_label = np.zeros((height, width, 1), dtype=np.uint8)
    for w in range(width):
        for h in range(height):
            b, g, r = im[h, w, :]  # OpenCV stores the channels in BGR order
            m_label[h, w, :] = color2index[(r, g, b)]
    return m_label
The input im is a numpy array created by cv2.imread(). However, this code is really slow. Since im is a numpy array, I first tried a numpy ufunc, with something like this:
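# Does not work as intended: the function is called once per scalar element,
# so x is a single channel value, not an (R, G, B) triple.
RGB2index = np.frompyfunc(lambda x: color2index[tuple(x)], 1, 1)
indices = RGB2index(im)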
But it turns out that a ufunc takes only one element at a time. I was unable to pass the function all three values (R, G, B) at once.
So is there any other way to optimize this? The mapping does not have to be a dict if a more efficient data structure exists. I noticed that the dict lookup itself does not cost much time, but converting a numpy array to a tuple (which is hashable) does.
PS: One idea I had is to implement a kernel in CUDA, but that would be more complicated.
UPDATE 1: Dan Mašek's answer works fine. But first we have to convert the RGB image to grayscale, which could be problematic when two colors have the same grayscale value.
I paste the working code here. Hope it helps others.
lut = np.ones(256, dtype=np.uint8) * 255
lut[[255,29,179,150,226,76]] = np.arange(6, dtype=np.uint8)
im_out = cv2.LUT(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY), lut)
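The grayscale collision mentioned above can be avoided by keeping all three channels and packing them into one 24-bit integer before the table lookup. The following is a minimal sketch of that idea, not part of the accepted answer: the names bgr2index and UNKNOWN are hypothetical, the value 255 for unmapped colors is an assumption, and the input is assumed to be a uint8 array in BGR order as cv2.imread returns it.

```python
import numpy as np

# Mapping from the question, keyed by (R, G, B).
color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}

UNKNOWN = 255  # assumed marker for colors that are missing from the mapping

# One table entry per possible 24-bit color (16 MiB of uint8).
lut = np.full(256 ** 3, UNKNOWN, dtype=np.uint8)
for (r, g, b), idx in color2index.items():
    lut[(r << 16) | (g << 8) | b] = idx


def bgr2index(im):
    """Map an (H, W, 3) uint8 BGR image to an (H, W) uint8 index image."""
    im = im.astype(np.int32)  # widen so the shifts below cannot overflow
    b, g, r = im[..., 0], im[..., 1], im[..., 2]
    codes = (r << 16) | (g << 8) | b
    return lut[codes]


# Tiny hand-made BGR image: white, blue, red and one unmapped color.
demo = np.array([[[255, 255, 255], [255, 0, 0]],
                 [[0, 0, 255], [1, 2, 3]]], dtype=np.uint8)
demo_idx = bgr2index(demo)
print(demo_idx.tolist())  # [[0, 1], [5, 255]]
```

The table costs memory up front, but every pixel is then resolved by a single array lookup, and two different colors can never share an entry.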
What about this?
color2index = {
(255, 255, 255) : 0,
(0, 0, 255) : 1,
(0, 255, 255) : 2,
(0, 255, 0) : 3,
(255, 255, 0) : 4,
(255, 0, 0) : 5
}
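import numpy as np

def rgb2mask(img):
    assert len(img.shape) == 3
    height, width, ch = img.shape
    assert ch == 3
    # Pack the three channels of each pixel into a single integer ID
    W = np.power(256, [[0], [1], [2]])
    img_id = img.dot(W).squeeze(-1)
    values = np.unique(img_id)
    mask = np.zeros(img_id.shape)
    # Loop over the distinct colors only, not over every pixel
    for c in values:
        try:
            mask[img_id == c] = color2index[tuple(img[img_id == c][0])]
        except KeyError:
            pass  # colors missing from color2index keep the value 0
    return mask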
Then just call:
mask = rgb2mask(img)
Actually, the for loop is what takes most of the time. A boolean mask for a single color can be computed without any loop:
binary_mask = (im_array[:,:,0] == 255) & (im_array[:,:,1] == 255) & (im_array[:,:,2] == 0)
Maybe the code above can help you.
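The single-color comparison can be extended to every color at once with broadcasting. This is only a sketch under some assumptions: the names palette and rgb2index_broadcast are hypothetical, the array's channel order is assumed to match the palette (for an image read with cv2.imread, reverse the channels first with im_array[..., ::-1]), and 255 is an arbitrary marker for pixels that match no palette color.

```python
import numpy as np

# Row i is the (R, G, B) color of label i, in the order used by the question.
palette = np.array([
    [255, 255, 255],
    [0, 0, 255],
    [0, 255, 255],
    [0, 255, 0],
    [255, 255, 0],
    [255, 0, 0],
], dtype=np.uint8)


def rgb2index_broadcast(im_array, palette, unknown=255):
    """Return an (H, W) uint8 index image for an (H, W, 3) array."""
    # matches[h, w, k] is True where pixel (h, w) equals palette color k.
    matches = (im_array[:, :, None, :] == palette[None, None, :, :]).all(axis=-1)
    index = matches.argmax(axis=-1).astype(np.uint8)
    # argmax returns 0 when nothing matched, so mark those pixels explicitly.
    index[~matches.any(axis=-1)] = unknown
    return index


demo = np.array([[[255, 255, 0], [0, 255, 0]],
                 [[255, 255, 255], [9, 9, 9]]], dtype=np.uint8)
demo_idx = rgb2index_broadcast(demo, palette)
print(demo_idx.tolist())  # [[4, 3], [0, 255]]
```

The intermediate boolean array has one entry per pixel, palette color and channel, so this suits small palettes like the six colors here better than large ones.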
"I've implemented a naive function: … I first tried a numpy ufunc with something like this: …"
I suggest using an even more naive function which converts just one pixel:
def rgb2index(rgb):
    """
    Turn a 3-channel RGB color into a 1-channel index color.
    """
    return color2index[tuple(rgb)]
Then using a numpy routine is a good idea, but we don't need a ufunc:
np.apply_along_axis(rgb2index, 2, im)
Here numpy.apply_along_axis() is used to apply our rgb2index() function to the RGB slices along the last of the three axes (0, 1, 2) for the whole image im.
We could even do without the function and just write:
np.apply_along_axis(lambda rgb: color2index[tuple(rgb)], 2, im)
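As far as I know, np.apply_along_axis still calls the Python function once per pixel, so the dict lookup and the tuple conversion happen just as often as in the original loop. A possible variation, sketched here with the hypothetical name rgb2index_unique and an assumed marker of 255 for unmapped colors, lets np.unique find the distinct colors first so that the dict is consulted only once per color:

```python
import numpy as np

color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}


def rgb2index_unique(im, unknown=255):
    """Map an (H, W, 3) RGB array to an (H, W) uint8 index image."""
    h, w, _ = im.shape
    flat = im.reshape(-1, 3)
    # colors: the distinct rows; inverse: for each pixel, its row in `colors`.
    colors, inverse = np.unique(flat, axis=0, return_inverse=True)
    # One dict lookup per distinct color instead of one per pixel.
    per_color = np.array(
        [color2index.get(tuple(int(v) for v in c), unknown) for c in colors],
        dtype=np.uint8,
    )
    # reshape(-1) keeps this independent of the shape NumPy gives `inverse`.
    return per_color[inverse.reshape(-1)].reshape(h, w)


demo = np.array([[[255, 0, 0], [0, 0, 255]],
                 [[255, 0, 0], [7, 7, 7]]], dtype=np.uint8)
demo_idx = rgb2index_unique(demo)
print(demo_idx.tolist())  # [[5, 1], [5, 255]]
```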
This is similar to what Armali and Mendrika proposed, but I had to tweak it a little to get it to work (maybe totally my fault). So I just wanted to share a snippet that works.
import numpy as np

COLORS = np.array([
    [0, 0, 0],
    [0, 0, 255],
    [255, 0, 0]
])
# Base 256 gives every 8-bit color a unique hash; with base 255,
# e.g. (255, 0, 0) and (0, 1, 0) would collide.
W = np.power(256, [0, 1, 2])
HASHES = np.sum(W * COLORS, axis=-1)
HASH2COLOR = {h: c for h, c in zip(HASHES, COLORS)}
HASH2IDX = {h: i for i, h in enumerate(HASHES)}

def rgb2index(segmentation_rgb):
    """
    Turn a 3-channel RGB image into a 1-channel index image.
    """
    s_shape = segmentation_rgb.shape
    s_hashes = np.sum(W * segmentation_rgb, axis=-1)
    func = lambda x: HASH2IDX[int(x[0])]  # x is a length-1 slice
    segmentation_idx = np.apply_along_axis(func, 0, s_hashes.reshape((1, -1)))
    segmentation_idx = segmentation_idx.reshape(s_shape[:2])
    return segmentation_idx

segmentation = np.array([[0, 0, 0], [0, 0, 255], [255, 0, 0]] * 3).reshape((3, 3, 3))
rgb2index(segmentation)
The code is also available here: https://github.com/theRealSuperMario/supermariopy/blob/dev/scripts/rgb2labels.py
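The per-pixel dict lookup in the snippet above could probably be replaced by np.searchsorted on the sorted hashes, which keeps the whole conversion inside numpy. This is a sketch rather than the code from the linked repository: ORDER, SORTED_HASHES and rgb2index_sorted are hypothetical names, and it uses base 256 for the hash because with base 255 two different colors such as (255, 0, 0) and (0, 1, 0) pack to the same value.

```python
import numpy as np

COLORS = np.array([
    [0, 0, 0],
    [0, 0, 255],
    [255, 0, 0],
])
W = np.power(256, [0, 1, 2])      # base 256 keeps the packed codes unique
HASHES = np.sum(W * COLORS, axis=-1)

ORDER = np.argsort(HASHES)        # searchsorted needs sorted keys
SORTED_HASHES = HASHES[ORDER]


def rgb2index_sorted(segmentation_rgb):
    """Map an (H, W, 3) color image to an (H, W) array of rows in COLORS."""
    s_hashes = np.sum(W * segmentation_rgb, axis=-1)
    pos = np.searchsorted(SORTED_HASHES, s_hashes)
    pos = np.clip(pos, 0, len(SORTED_HASHES) - 1)
    # A hash that is not in the table lands on a neighbouring entry; detect that.
    if not np.array_equal(SORTED_HASHES[pos], s_hashes):
        raise KeyError("image contains a color that is not in COLORS")
    return ORDER[pos]


segmentation = np.array([[0, 0, 0], [0, 0, 255], [255, 0, 0]] * 3).reshape((3, 3, 3))
segmentation_idx = rgb2index_sorted(segmentation)
print(segmentation_idx.tolist())  # [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
```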
Did you check the Pillow library (https://python-pillow.org/)? As I remember, it has some classes and methods to deal with color conversion. See: https://pillow.readthedocs.io/en/4.0.x/reference/Image.html#PIL.Image.Image.convert
Here's a small utility function to convert images (np.array) to per-pixel labels (indices), which can also be a one-hot encoding:
import cv2
import numpy as np

def rgb2label(img, color_codes=None, one_hot_encode=False):
    if color_codes is None:
        # Build the codebook from the distinct colors found in the image
        color_codes = {val: i for i, val in enumerate(set(tuple(v) for m2d in img for v in m2d))}
    n_labels = len(color_codes)
    # Pixels whose color is not in the codebook keep the label -1
    result = np.full(img.shape[:2], -1, dtype=int)
    for rgb, idx in color_codes.items():
        result[(img == rgb).all(2)] = idx

    if one_hot_encode:
        one_hot_labels = np.zeros((img.shape[0], img.shape[1], n_labels))
        # one-hot encoding
        for c in range(n_labels):
            one_hot_labels[:, :, c] = (result == c).astype(int)
        result = one_hot_labels

    return result, color_codes

img = cv2.imread("input_rgb_for_labels.png")
img_labels, color_codes = rgb2label(img)
print(color_codes)  # e.g. to see what the codebook is
img1 = cv2.imread("another_rgb_for_labels.png")
img1_labels, _ = rgb2label(img1, color_codes)  # use the same codebook
It calculates (and returns) the color codebook if None is supplied.
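The one-hot step can also be written without a loop over the labels by indexing an identity matrix with the label image. A minimal sketch, assuming every label lies in 0..n_labels-1 (a -1 left over from an unmatched color would silently select the last row, so such pixels need handling first); the helper name one_hot is hypothetical:

```python
import numpy as np


def one_hot(labels, n_labels):
    """Turn an (H, W) integer label image into an (H, W, n_labels) array."""
    # Row k of the identity matrix is the one-hot vector for label k.
    return np.eye(n_labels, dtype=np.uint8)[labels]


labels = np.array([[0, 2],
                   [1, 1]])
encoded = one_hot(labels, 3)
print(encoded.shape)          # (2, 2, 3)
print(encoded[0, 1].tolist())  # [0, 0, 1]
```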
If you are happy using MATLAB (perhaps saving the result as *.mat and loading it with scipy.io.loadmat), there is the rgb2ind function in MATLAB, which does exactly what you are asking for. If not, it could serve as inspiration for a similar implementation in Python.