
Convert RGB image to index image

I want to convert a 3-channel RGB image to an index image with Python. It's used for handling the labels when training a deep net for semantic segmentation. By index image I mean an image with one channel in which each pixel is an index, starting from zero. The index image should of course have the same height and width as the RGB image. The conversion is based on the following mapping, given as a Python dict:

color2index = {
        (255, 255, 255) : 0,
        (0,     0, 255) : 1,
        (0,   255, 255) : 2,
        (0,   255,   0) : 3,
        (255, 255,   0) : 4,
        (255,   0,   0) : 5
    }

I've implemented a naive function:

import numpy as np

def im2index(im):
    """
    turn a 3 channel RGB image to 1 channel index image
    """
    assert len(im.shape) == 3
    height, width, ch = im.shape
    assert ch == 3
    m_label = np.zeros((height, width, 1), dtype=np.uint8)
    for w in range(width):
        for h in range(height):
            # cv2.imread() returns the channels in BGR order
            b, g, r = im[h, w, :]
            m_label[h, w, :] = color2index[(r, g, b)]
    return m_label

The input im is a NumPy array created by cv2.imread(). However, this code is really slow. Since im is a NumPy array, I first tried a NumPy ufunc, with something like this:

# Does not work as intended: the ufunc is called once per scalar element, not once per pixel
RGB2index = np.frompyfunc(lambda x: color2index[tuple(x)], 1, 1)
indices = RGB2index(im)

But it turns out that the ufunc receives only one element at a time. I was unable to pass the three values of a pixel (RGB) to the function in one call.
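For what it's worth, np.frompyfunc can be told to take three inputs, so the three channel planes can be passed as separate arguments. The following is only a minimal sketch under that assumption; the helper name im2index_ufunc is made up for illustration, and since the lambda is still called once per pixel in Python, it is not expected to be much faster than the nested loops.

```python
import numpy as np

# Mapping from the question: RGB tuple -> class index.
color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}

# nin=3, nout=1: the ufunc receives one r, one g and one b value per pixel.
rgb_to_index = np.frompyfunc(lambda r, g, b: color2index[(r, g, b)], 3, 1)


def im2index_ufunc(im_bgr):
    """Hypothetical helper: BGR image (H, W, 3) -> index image (H, W)."""
    b, g, r = im_bgr[..., 0], im_bgr[..., 1], im_bgr[..., 2]
    # frompyfunc returns an object array, so cast it back to uint8.
    return rgb_to_index(r, g, b).astype(np.uint8)


# Tiny 1x2 BGR test image: white, then pure red (BGR order: 0, 0, 255).
demo = np.array([[[255, 255, 255], [0, 0, 255]]], dtype=np.uint8)
demo_idx = im2index_ufunc(demo)
```

Here demo_idx has shape (1, 2): one index per pixel, with the channel axis gone.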

So are there any other ways to do the optimization? The mapping does not have to be a dict if a more efficient data structure exists. I noticed that accessing a Python dict does not cost much time, but converting a NumPy array to a tuple (which is hashable) does.

PS: One idea I had is to implement a kernel in CUDA, but that would be more complicated.

UPDATE 1: Dan Mašek's answer works fine. But first we have to convert the RGB image to grayscale, which could be problematic when two colors have the same grayscale value.

I'm pasting the working code here. Hope it helps others.

lut = np.ones(256, dtype=np.uint8) * 255
lut[[255,29,179,150,226,76]] = np.arange(6, dtype=np.uint8)
im_out = cv2.LUT(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY), lut)
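One possible way around the grayscale collision is to skip the grayscale step and build the lookup table over the full 24-bit colour instead. This is a sketch, not part of the answer referred to above: the names lut24 and bgr2index_lut are hypothetical, the table costs about 16 MB, and 255 is assumed to be a free value for marking unknown colours.

```python
import numpy as np

# Mapping from the question: RGB tuple -> class index.
color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}

# One entry per possible 24-bit RGB colour; 255 marks "unknown colour".
lut24 = np.full(1 << 24, 255, dtype=np.uint8)
for (r, g, b), idx in color2index.items():
    lut24[(r << 16) | (g << 8) | b] = idx


def bgr2index_lut(im_bgr):
    """Hypothetical helper: BGR uint8 image (H, W, 3) -> index image (H, W)."""
    im32 = im_bgr.astype(np.uint32)  # widen so the shifts cannot overflow
    # Channel 2 is red and channel 0 is blue in a BGR image.
    key = (im32[..., 2] << 16) | (im32[..., 1] << 8) | im32[..., 0]
    return lut24[key]


# 1x3 BGR test image: yellow, cyan, and a colour that is not in the mapping.
demo = np.array([[[0, 255, 255], [255, 255, 0], [1, 2, 3]]], dtype=np.uint8)
demo_idx = bgr2index_lut(demo)
```

Because every colour gets its own slot, two colours can never share an entry, and the per-pixel work stays inside NumPy.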

What about this?

color2index = {
    (255, 255, 255) : 0,
    (0,     0, 255) : 1,
    (0,   255, 255) : 2,
    (0,   255,   0) : 3,
    (255, 255,   0) : 4,
    (255,   0,   0) : 5
}

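import numpy as np

def rgb2mask(img):

    assert len(img.shape) == 3
    height, width, ch = img.shape
    assert ch == 3

    # Weights that pack the three channel values into one integer per pixel
    W = np.power(256, [[0],[1],[2]])

    img_id = img.dot(W).squeeze(-1)
    values = np.unique(img_id)

    mask = np.zeros(img_id.shape)

    # Loop over the distinct colors only, not over every pixel
    for c in values:
        try:
            mask[img_id==c] = color2index[tuple(img[img_id==c][0])]
        except KeyError:
            # Colors missing from color2index keep the default value 0
            pass
    return mask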

Then just call:

mask = rgb2mask(img)
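The remaining Python loop over distinct colours could probably be dropped as well by asking np.unique for the inverse mapping. The following is a sketch of that idea under a few assumptions: rgb2mask_unique and the unknown value 255 are illustrative choices, and the dict keys are taken to be in the same channel order as the image.

```python
import numpy as np

color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}


def rgb2mask_unique(img, unknown=255):
    """Hypothetical variant: one dict lookup per distinct colour."""
    # Pack each pixel into one integer: c0 + 256 * c1 + 65536 * c2.
    img_id = img.astype(np.int64).dot(np.array([1, 256, 65536]))
    values, inverse = np.unique(img_id.ravel(), return_inverse=True)
    # Unpack every distinct id back into its (c0, c1, c2) tuple and look it up.
    table = np.array(
        [color2index.get((v & 255, (v >> 8) & 255, v >> 16), unknown)
         for v in values.tolist()],
        dtype=np.uint8,
    )
    # inverse tells, for each pixel, which entry of values (and table) it uses.
    return table[inverse].reshape(img_id.shape)


demo = np.array(
    [[[255, 255, 255], [0, 0, 255]],
     [[255, 0, 0], [9, 9, 9]]],
    dtype=np.uint8,
)
demo_mask = rgb2mask_unique(demo)
```

Unlike rgb2mask, this sketch marks colours that are missing from color2index with the unknown value instead of leaving them at 0, so they cannot be confused with class 0.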

Actually, the for loop takes most of the time.

binary_mask = (im_array[:,:,0] == 255) & (im_array[:,:,1] == 255) & (im_array[:,:,2] == 0) 

Maybe the code above can help you.

I've implemented a naive function: … I firstly tried the ufunc of numpy with something like this: …

I suggest using an even more naive function, which converts just one pixel:

def rgb2index(rgb):
    """
    turn a 3 channel RGB color to 1 channel index color
    """
    return color2index[tuple(rgb)]

Then using a NumPy routine is a good idea, but we don't need a ufunc:

np.apply_along_axis(rgb2index, 2, im)

Here numpy.apply_along_axis() applies our rgb2index() function to the RGB slices along the last of the three axes (0, 1, 2) for the whole image im.

We could even do without the function and just write:

np.apply_along_axis(lambda rgb: color2index[tuple(rgb)], 2, im)
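As a small usage sketch on a made-up 2x2 image: since the question loads images with cv2.imread(), the channels arrive in BGR order, so the example below reverses the last axis before the lookup. The array values and the names im_bgr, im_rgb and index_im are chosen only for illustration. Note that apply_along_axis still calls the Python function once per pixel, so this is mainly a tidier spelling of the loop.

```python
import numpy as np

color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}

# Made-up 2x2 image in BGR order, as cv2.imread() would deliver it.
im_bgr = np.array(
    [[[255, 255, 255], [0, 0, 255]],
     [[0, 255, 0], [255, 0, 0]]],
    dtype=np.uint8,
)

# Reverse the channel axis so each slice matches the RGB keys of color2index.
im_rgb = im_bgr[..., ::-1]

# The function returns a scalar, so axis 2 disappears from the result.
index_im = np.apply_along_axis(lambda rgb: color2index[tuple(rgb)], 2, im_rgb)
```

The result index_im has shape (2, 2), one index per pixel.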

Similar to what Armali and Mendrika proposed, but I had to tweak it a little to get it to work (maybe that was entirely my fault). So I just wanted to share a snippet that works.

import numpy as np

COLORS = np.array([
    [0, 0, 0],
    [0, 0, 255],
    [255, 0, 0]
])
# Base 256, so that two different 8-bit colors can never get the same hash
W = np.power(256, [0, 1, 2])

HASHES = np.sum(W * COLORS, axis=-1)
HASH2COLOR = {h : c for h, c in zip(HASHES, COLORS)}
HASH2IDX = {h: i for i, h in enumerate(HASHES)}


def rgb2index(segmentation_rgb):
    """
    turn a 3 channel RGB color to 1 channel index color
    """
    s_shape = segmentation_rgb.shape
    s_hashes = np.sum(W * segmentation_rgb, axis=-1)
    # x is a length-1 slice, so take its single element before the lookup
    func = lambda x: HASH2IDX[int(x[0])]
    segmentation_idx = np.apply_along_axis(func, 0, s_hashes.reshape((1, -1)))
    segmentation_idx = segmentation_idx.reshape(s_shape[:2])
    return segmentation_idx

segmentation = np.array([[0, 0, 0], [0, 0, 255], [255, 0, 0]] * 3).reshape((3, 3, 3))
rgb2index(segmentation)
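If the per-pixel dictionary lookup turns out to be the bottleneck, the hashes could be matched with np.searchsorted instead. This is only a sketch of that alternative: rgb2index_sorted, ORDER and SORTED_HASHES are hypothetical names, and it assumes base-256 weights so that distinct colours cannot collide.

```python
import numpy as np

COLORS = np.array([
    [0, 0, 0],
    [0, 0, 255],
    [255, 0, 0],
])
W = np.power(256, [0, 1, 2])  # assumed base 256: unique hash per 8-bit colour
HASHES = np.sum(W * COLORS, axis=-1)

# searchsorted needs sorted keys; ORDER maps sorted positions back to indices.
ORDER = np.argsort(HASHES)
SORTED_HASHES = HASHES[ORDER]


def rgb2index_sorted(segmentation_rgb):
    """Hypothetical vectorised lookup: (H, W, 3) colours -> (H, W) indices."""
    s_hashes = np.sum(W * segmentation_rgb.astype(np.int64), axis=-1)
    pos = np.searchsorted(SORTED_HASHES, s_hashes)
    pos = np.clip(pos, 0, len(SORTED_HASHES) - 1)
    # A hash that is not in the table lands next to a different value.
    if not np.array_equal(SORTED_HASHES[pos], s_hashes):
        raise ValueError("image contains a colour that is not in COLORS")
    return ORDER[pos]


segmentation = np.array(
    [[0, 0, 0], [0, 0, 255], [255, 0, 0]] * 3
).reshape((3, 3, 3))
segmentation_idx = rgb2index_sorted(segmentation)
```

The binary search runs inside NumPy for all pixels at once, and unknown colours raise an error rather than a KeyError deep inside a lambda.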

[Image: example plot]

The code is also available here: https://github.com/theRealSuperMario/supermariopy/blob/dev/scripts/rgb2labels.py

Have you checked the Pillow library (https://python-pillow.org/)? As I remember, it has some classes and methods to deal with color conversion. See: https://pillow.readthedocs.io/en/4.0.x/reference/Image.html#PIL.Image.Image.convert

Here's a small utility function to convert images (np.array) to per-pixel labels (indices), which can optionally be one-hot encoded:

def rgb2label(img, color_codes = None, one_hot_encode=False):
    if color_codes is None:
        color_codes = {val:i for i,val in enumerate(set( tuple(v) for m2d in img for v in m2d ))}
    n_labels = len(color_codes)
    result = np.ndarray(shape=img.shape[:2], dtype=int)
    result[:,:] = -1
    for rgb, idx in color_codes.items():
        result[(img==rgb).all(2)] = idx

    if one_hot_encode:
        one_hot_labels = np.zeros((img.shape[0],img.shape[1],n_labels))
        # one-hot encoding
        for c in range(n_labels):
            one_hot_labels[: , : , c ] = (result == c ).astype(int)
        result = one_hot_labels

    return result, color_codes


img = cv2.imread("input_rgb_for_labels.png")
img_labels, color_codes = rgb2label(img)
print(color_codes) # e.g. to see what the codebook is

img1 = cv2.imread("another_rgb_for_labels.png")
img1_labels, _ = rgb2label(img1, color_codes) # use the same codebook

It calculates (and returns) the color codebook if None is supplied.
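The one-hot step can also be written without a loop over the labels by indexing an identity matrix with the label image. A minimal sketch, assuming every label lies in the range 0 to n_labels - 1 (unmatched pixels marked -1 would have to be handled first); labels_to_one_hot is a hypothetical helper name.

```python
import numpy as np


def labels_to_one_hot(labels, n_labels):
    """Hypothetical helper: (H, W) int labels -> (H, W, n_labels) one-hot."""
    # Row k of the identity matrix is the one-hot vector for label k,
    # so fancy indexing with the label image picks one row per pixel.
    return np.eye(n_labels, dtype=np.uint8)[labels]


labels = np.array([[0, 2], [1, 1]])
one_hot = labels_to_one_hot(labels, 3)
```

Each pixel of one_hot then holds a vector with a single 1 at the position of its label.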

If you are happy using MATLAB (perhaps saving the result as *.mat and loading it with scipy.io.loadmat), there is the rgb2ind function in MATLAB, which does exactly what you are asking for. If not, it could serve as inspiration for a similar implementation in Python.

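Disclaimer: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.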

 