Let's make a reference implementation of N-dimensional pixel binning/bucketing for python's numpy

Question

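I frequently want to pixel bin / pixel bucket a numpy array, meaning replace each group of N consecutive pixels with a single pixel that is the sum of the N replaced pixels. For example, start with the values: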

x = np.array([1, 3, 7, 3, 2, 9])

With a bucket size of 2, this transforms to:

bucket(x, bucket_size=2) 
= [1+3, 7+3, 2+9]
= [4, 10, 11]

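As far as I know, there is no numpy function that specifically does this (please correct me if I'm wrong!), so I often roll my own. For 1d numpy arrays, this isn't bad: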

import numpy as np

def bucket(x, bucket_size):
    return x.reshape(x.size // bucket_size, bucket_size).sum(axis=1)

bucket_me = np.array([3, 4, 5, 5, 1, 3, 2, 3])
print(bucket(bucket_me, bucket_size=2)) #[ 7 10  4  5]

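...however, I get confused very easily in the multidimensional case, and I end up rolling my own buggy, half-finished solution to this "easy" problem over and over again. I'd love it if we could establish a nice N-dimensional reference implementation.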

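Preferably, the function call would allow different bin sizes along different axes (perhaps something like bucket(x, bucket_size=(2, 2, 3)))
Preferably, the solution would be reasonably efficient (reshape and sum are fairly quick in numpy)
Bonus points for handling edge effects when the array doesn't divide evenly into an integer number of buckets.
Bonus points for allowing the user to choose the initial bin edge offset.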

As suggested by Divakar, here is my desired behavior in a sample 2-D case:

x = np.array([[1, 2, 3, 4],
              [2, 3, 7, 9],
              [8, 9, 1, 0],
              [0, 0, 3, 4]])

bucket(x, bucket_size=(2, 2))
= [[1 + 2 + 2 + 3, 3 + 4 + 7 + 9],
   [8 + 9 + 0 + 0, 1 + 0 + 3 + 4]]
= [[8, 23],
   [17, 8]]

...hopefully I did my arithmetic correctly ;)

Answer 1

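I think you can do most of the fiddly work with skimage's view_as_blocks. This function is implemented using as_strided, so it is very efficient (it only changes the stride information to reshape the array). Because it is written in Python/NumPy, you can always copy the code if you don't have skimage installed.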

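After applying that function, you just need to sum the N trailing axes of the reshaped array (where N is the length of the bucket_size tuple). Here is a new bucket() function: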

from skimage.util import view_as_blocks

def bucket(x, bucket_size):
    blocks = view_as_blocks(x, bucket_size)
    tup = tuple(range(-len(bucket_size), 0))
    return blocks.sum(axis=tup)

Then for example:

>>> x = np.array([1, 3, 7, 3, 2, 9])
>>> bucket(x, bucket_size=(2,))
array([ 4, 10, 11])

>>> x = np.array([[1, 2, 3, 4],
...               [2, 3, 7, 9],
...               [8, 9, 1, 0],
...               [0, 0, 3, 4]])

>>> bucket(x, bucket_size=(2, 2))
array([[ 8, 23],
       [17,  8]])

>>> y = np.arange(6*6*6).reshape(6,6,6)
>>> bucket(y, bucket_size=(2, 2, 3))
array([[[ 264,  300],
        [ 408,  444],
        [ 552,  588]],

       [[1128, 1164],
        [1272, 1308],
        [1416, 1452]],

       [[1992, 2028],
        [2136, 2172],
        [2280, 2316]]])
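
If skimage is not installed, a similar result can be sketched with plain reshape: split each axis of length n into (n // b, b) and sum over the within-bucket axes. The name bucket_reshape is made up for this illustration, and the sketch assumes that trailing elements which do not fill a whole bucket are simply dropped.

```python
import numpy as np

def bucket_reshape(x, bucket_size):
    """Sum non-overlapping blocks of shape bucket_size (hypothetical helper)."""
    x = np.asarray(x)
    if len(bucket_size) != x.ndim:
        raise ValueError("bucket_size needs one entry per axis of x")
    # Assumption: drop trailing elements that do not fill a whole bucket.
    trimmed = x[tuple(slice(0, (n // b) * b)
                      for n, b in zip(x.shape, bucket_size))]
    # Split every axis into (number of buckets, bucket size).
    split_shape = []
    for n, b in zip(trimmed.shape, bucket_size):
        split_shape.extend([n // b, b])
    # The within-bucket axes are the odd ones: 1, 3, 5, ...
    block_axes = tuple(range(1, 2 * x.ndim, 2))
    return trimmed.reshape(split_shape).sum(axis=block_axes)

x = np.array([[1, 2, 3, 4],
              [2, 3, 7, 9],
              [8, 9, 1, 0],
              [0, 0, 3, 4]])
result = bucket_reshape(x, (2, 2))
print(result)
```

For the 2-D sample from the question this gives the same [[8, 23], [17, 8]] as the view_as_blocks version. The reshape may copy the data when trimming leaves a non-contiguous slice.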

Answer 2

To specify different bin sizes along each axis in the ndarray case, you can use np.add.reduceat iteratively along each axis, like so -

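import numpy as np

def bucket(x, bin_size):
    # bin_size[i] lists the widths of the consecutive bins along axis i.
    ndims = x.ndim
    out = x.copy()
    for i in range(ndims):
        # Start index of each bin: 0, then the running sum of the widths.
        idx = np.append(0, np.cumsum(bin_size[i][:-1]))
        out = np.add.reduceat(out, idx, axis=i)
    return out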

Sample run -

In [126]: x
Out[126]: 
array([[165, 107, 133,  82, 199],
       [ 35, 138,  91, 100, 207],
       [ 75,  99,  40, 240, 208],
       [166, 171,  78,   7, 141]])

In [127]: bucket(x, bin_size = [[2, 2],[3, 2]])
Out[127]: 
array([[669, 588],
       [629, 596]])

#  [2, 2] are the bin sizes along axis=0
#  [3, 2] are the bin sizes along axis=1

# array([[165, 107, 133, | 82, 199],
#        [ 35, 138,  91, | 100, 207],
# -------------------------------------
#        [ 75,  99, 40,  | 240, 208],
#        [166, 171, 78,  | 7, 141]])

In [128]: x[:2,:3].sum()
Out[128]: 669

In [129]: x[:2,3:].sum()
Out[129]: 588

In [130]: x[2:,:3].sum()
Out[130]: 629

In [131]: x[2:,3:].sum()
Out[131]: 596
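
The reduceat idea can also be sketched for a single bucket size per axis. np.add.reduceat sums from each start index up to the next one, and the last bin runs to the end of the axis, so a short final bucket is kept rather than dropped, and starting the indices at an offset skips the leading elements. The function name bucket_reduceat and the offset parameter are assumptions made for this illustration.

```python
import numpy as np

def bucket_reduceat(x, bucket_size, offset=None):
    """Uniform buckets per axis via np.add.reduceat (hypothetical helper)."""
    x = np.asarray(x)
    if offset is None:
        offset = (0,) * x.ndim
    out = x
    for axis, (b, off) in enumerate(zip(bucket_size, offset)):
        # Start index of every bucket along this axis.
        starts = np.arange(off, x.shape[axis], b)
        out = np.add.reduceat(out, starts, axis=axis)
    return out

x = np.array([1, 3, 7, 3, 2, 9, 5])
print(bucket_reduceat(x, (2,)))               # [ 4 10 11  5]
print(bucket_reduceat(x, (2,), offset=(1,)))  # [10  5 14]
```

In the first call the last bucket holds only the single leftover element 5. In the second call the leading 1 is skipped and the buckets are 3+7, 3+2 and 9+5.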

Answer 3

Natively, with as_strided:

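import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([[1, 2, 3, 4],
              [2, 3, 7, 9],
              [8, 9, 1, 0],
              [0, 0, 3, 4]])

def bucket(x, bucket_size):
    x = np.ascontiguousarray(x)
    oldshape = np.array(x.shape)
    # Grid of blocks first, then the shape of a single block.
    newshape = np.concatenate((oldshape // bucket_size, bucket_size))
    oldstrides = np.array(x.strides)
    # Moving to the next block skips bucket_size elements; inside a block, one.
    newstrides = np.concatenate((oldstrides * bucket_size, oldstrides))
    # Sum over the trailing (within-block) axes.
    axis = tuple(range(x.ndim, 2 * x.ndim))
    return as_strided(x, newshape, newstrides).sum(axis)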

If bucket_size does not evenly divide the corresponding dimensions of x, the remaining elements are lost.

Verification:

In [9]: bucket(x,(2,2))
Out[9]: 
array([[ 8, 23],
       [17,  8]])
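
To keep the leftover elements instead of losing them, one option is to zero-pad the array up to the next multiple of the bucket size before bucketing. The helper name pad_to_multiple is hypothetical, and zero padding is an assumption that suits sums (it would bias a mean). A minimal sketch:

```python
import numpy as np

def pad_to_multiple(x, bucket_size):
    """Zero-pad the end of each axis to a multiple of its bucket size."""
    x = np.asarray(x)
    # -n % b is the number of elements missing to reach a multiple of b.
    pad_width = [(0, -n % b) for n, b in zip(x.shape, bucket_size)]
    return np.pad(x, pad_width, mode="constant", constant_values=0)

x = np.arange(1, 8)                        # 7 elements, bucket size 3
padded = pad_to_multiple(x, (3,))
print(padded)                              # [1 2 3 4 5 6 7 0 0]
print(padded.reshape(-1, 3).sum(axis=1))   # [ 6 15  7]
```

The padded array can then be passed to a bucket() function that expects evenly divisible dimensions.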

Answer 1
4 2016-03-28 19:36:21

Answer 2
1 2016-03-28 20:02:47

Answer 3
1 accepted 2016-03-28 20:11:14
