Collapsing a numpy array along a new dimension, using its values as indices

Question

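I have a [m,m] numpy array whose elements are in {0, 1, 2, ..., 24}, and I want to separate each number along a third dimension to get a [m,m,24] array.

A simple example: a [5,5] array with elements in {0, 1, 2, 3}.

From this I need a `[5,5,3]` array.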
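To make the target concrete, here is a minimal sketch of the expected result for a tiny input. It assumes that each value k in 1..3 gets its own layer k-1 along the last axis, that matching positions are marked with 1, and that 0 is background. The 2x3 array is a hypothetical input chosen for illustration, not the array from the post.

```python
import numpy as np

# Hypothetical 2x3 input with values in {0, 1, 2, 3}; 0 is treated as background.
img = np.array([[0, 1, 2],
                [3, 0, 1]])

# Desired result: one layer per value 1..3 along a new last axis.
expected = np.zeros((2, 3, 3), dtype=int)
expected[0, 1, 0] = 1  # value 1 at (0, 1) goes to layer 0
expected[1, 2, 0] = 1  # value 1 at (1, 2) goes to layer 0
expected[0, 2, 1] = 1  # value 2 at (0, 2) goes to layer 1
expected[1, 0, 2] = 1  # value 3 at (1, 0) goes to layer 2

print(expected.shape)
```

Each layer `expected[..., k]` is then just the mask `img == k + 1`.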

Currently I have a simple approach, but it is computationally very expensive, and I need to run this operation often.

img = np.expand_dims(img, axis=2)
for i in range(24):
    img_norm[..., i] = (img[..., 0] == (i + np.ones(shape=img[..., 0].shape)))

For 64 arrays of size [224,224] with elements in {0, 1, 2, ..., 24}, the code above takes about 5 s.

Is there a faster way?

Answer 1

The following is very fast for me:

import numpy as np
max_num = 3
img = np.array([
    [0,0,1,0,0],
    [2,0,3,0,1],
    [0,2,3,1,0],
    [0,0,1,0,0],
    [1,0,2,0,1],
    ])

# Layer axis comes first: img_norm[k-1] holds the pixels equal to k.
img_norm = np.zeros((max_num,) + img.shape)
for idx in range(1, max_num + 1):
    img_norm[idx-1,:,:]=idx*(img == idx)

Testing with a random array of the size you specified:

max_num = 24
img = np.int64((max_num+1)*np.random.rand(224, 224)) # Random array

# Layer axis comes first, so the shape is (max_num, 224, 224).
img_norm = np.zeros((max_num,) + img.shape)
for idx in range(1, max_num + 1):
    img_norm[idx-1,:,:]=img*(img == idx)

This takes almost no time on my machine.

def getnorm_acdr(img):
    max_num = np.max(img)
    img_norm = np.zeros([max_num, *img.shape])
    for idx in range(1, max_num + 1):
        img_norm[idx-1,:,:]=img*(img == idx)
    return img_norm

img = np.int64((max_num+1)*np.random.rand(224, 224))

%timeit getnorm_acdr(img)

gives:

11.9 ms ± 536 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
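For what it's worth, the remaining Python loop over values can also be expressed with broadcasting. The sketch below is one possible way to do that, not something I have timed; `getnorm_broadcast` is a hypothetical name, and it assumes the same layer-first layout `(max_num, H, W)` and the same "store the value itself" convention as the loop above.

```python
import numpy as np

def getnorm_broadcast(img, max_num):
    """Return a (max_num, H, W) array; layer k-1 holds k where img == k, else 0."""
    # values has shape (max_num, 1, 1) and broadcasts against img[None], shape (1, H, W).
    values = np.arange(1, max_num + 1)[:, None, None]
    return img[None, :, :] * (img[None, :, :] == values)

img = np.array([
    [0, 0, 1, 0, 0],
    [2, 0, 3, 0, 1],
    [0, 2, 3, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 2, 0, 1],
])
layers = getnorm_broadcast(img, 3)
print(layers.shape)  # (3, 5, 5)
```

Note that this builds the full `(max_num, H, W)` comparison in one go, so it trades the loop for a larger temporary array; whether that is a win depends on the array sizes.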

Answer 2

Definitely more elegant: use np.ndenumerate().

for (i,j), val in np.ndenumerate(img):
    img_norm[val-1,i,j] = val

It looks like this should be faster than yours, since it is O(N^2) rather than O(N^3). Let's try it on an array with the size and contents you described:

def getnorm_ndenumerate(img):
    img_norm = np.zeros([np.max(img), *img.shape])
    for (i,j), val in np.ndenumerate(img):
        img_norm[val-1,i,j] = val  
    return img_norm

b = np.int64(25*np.random.rand(224, 224)) 

%timeit getnorm_ndenumerate(b)

gives

47.8 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

It is indeed faster than yours. However, elegance comes at a price: it is slower than acdr's approach.
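The per-element loop can also be replaced by a single indexed assignment. The following is a sketch of that idea, assuming the same layer-first output as `getnorm_ndenumerate`; the name `getnorm_fancy` is made up for this example and I have not timed it.

```python
import numpy as np

def getnorm_fancy(img):
    """Same output as the np.ndenumerate loop, using integer-array indexing."""
    img_norm = np.zeros((int(img.max()),) + img.shape)
    rows, cols = np.nonzero(img)            # positions of the nonzero entries
    vals = img[rows, cols]                  # their values, each in 1..img.max()
    img_norm[vals - 1, rows, cols] = vals   # one write per nonzero pixel
    return img_norm

rng = np.random.default_rng(0)
b = rng.integers(0, 25, size=(224, 224))   # values in 0..24
print(getnorm_fancy(b).shape)
```

Zeros are skipped explicitly here, whereas the loop version writes a harmless 0 into the last layer for them.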

Answer 3

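I made a mistake: in the output array, all nonzero entries should be 1. Sorry, that was silly of me.

Thanks for your help. I tested the three methods above: Jean-François Corbett's, acdr + Jean-François Corbett's, and my own code. It turns out that the acdr + Jean-François Corbett method is the fastest.

Here is my test code: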

import numpy as np


def test_time():
    def func1(img, max_num):
        w, h = img.shape
        img_norm = np.zeros([w, h, max_num], np.float32)
        for (i, j), val in np.ndenumerate(img):
            # img_norm[i, j, val - 1] = val
            img_norm[i, j, val - 1] = 0 if val == 0 else 1
        return img_norm

    def func2(img, max_num):
        w, h = img.shape
        img_norm = np.zeros([w, h, max_num], np.float32)
        for idx in range(1, max_num + 1):
            # img_norm[:, :, idx - 1] = idx*(img == idx)
            img_norm[:, :, idx - 1] = (img == idx)
        return img_norm

    def func3(img, max_num):
        w, h = img.shape
        img_norm = np.zeros([w, h, max_num], np.float32)
        for idx in range(max_num):
            # img_norm[:, :, idx] = (idx+1) * (img[:, :, 0] == (idx + np.ones(shape=img[:, :, 0].shape)))
            img_norm[:, :, idx] = (img == (idx + np.ones(shape=img.shape)))
        return img_norm

    import cv2
    img_tmp = cv2.imread('dat.png', cv2.IMREAD_UNCHANGED)
    # np.int was removed in NumPy 1.24; use an explicit integer dtype instead.
    img_tmp = np.asarray(img_tmp, np.int64)

    # img_tmp = np.array([
    #     [0, 0, 1, 0, 0],
    #     [2, 0, 3, 0, 1],
    #     [0, 2, 3, 1, 0],
    #     [0, 0, 1, 0, 0],
    #     [1, 0, 2, 0, 1],
    # ])

    img_bkp = np.array(img_tmp, copy=True)
    print(img_bkp.shape)
    import time
    cnt = 100
    maxnum = 24
    start_time = time.time()
    for i in range(cnt):
        _ = func1(img_tmp, maxnum)
    print('1 total time =', time.time() - start_time)

    start_time = time.time()
    for i in range(cnt):
        _ = func2(img_tmp, maxnum)
    print('2 total time =', time.time() - start_time)

    start_time = time.time()
    for i in range(cnt):
        _ = func3(img_tmp, maxnum)
    print('3 total time =', time.time() - start_time)

    print((img_tmp == img_bkp).all())
    img1 = func1(img_tmp, maxnum)
    img2 = func2(img_tmp, maxnum)
    img3 = func3(img_tmp, maxnum)
    print(img1.shape, img2.shape, img3.shape)
    print((img1 == img2).all())
    print((img2 == img3).all())
    print((img1 == img3).all())
    # print(type(img1[0, 0, 0]), type(img2[0, 0, 0]), type(img3[0, 0, 0]))
    # print('img1\n', img1[:, :, 2])
    # print('img3\n', img3[:, :, 2])

The output is

(224, 224)
1 total time = 4.738261938095093
2 total time = 0.7725710868835449
3 total time = 1.5980615615844727
True
(224, 224, 24) (224, 224, 24) (224, 224, 24)
True
True
True
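For comparison, here is a loop-free sketch that produces the same binary, layer-last result as func2 by indexing into an identity matrix. `one_hot_eye` is a hypothetical name, the approach assumes an integer `img` with values in 0..max_num, and I have not timed it against the functions above.

```python
import numpy as np

def one_hot_eye(img, max_num):
    """Return an (H, W, max_num) float32 array; layer k-1 is 1 where img == k."""
    # Row k of the identity matrix is the one-hot vector for value k,
    # so indexing with img gives shape (H, W, max_num + 1).
    one_hot = np.eye(max_num + 1, dtype=np.float32)[img]
    # Drop the first layer, which marks the background value 0.
    return one_hot[..., 1:]

rng = np.random.default_rng(0)
img = rng.integers(0, 25, size=(224, 224))  # stand-in for the image, values in 0..24
print(one_hot_eye(img, 24).shape)  # (224, 224, 24)
```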

If anything is unclear, please post it in the comments.

Thanks a lot for your help!

Answer 1
Score 3, accepted, 2019-01-10 10:14:07

Answer 2
Score 1, 2019-01-10 10:16:48

Answer 3
Score 0, 2019-01-10 13:01:46
