Slice 2d array into smaller 2d arrays

Is there a way to slice a 2d array in numpy into smaller 2d arrays?

Example

[[1,2,3,4],   ->    [[1,2] [3,4]   
 [5,6,7,8]]          [5,6] [7,8]]

So I basically want to cut a 2x4 array into two 2x2 arrays. I'm looking for a generic solution to be used on images.

There was another question a couple of months ago which clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense since this keeps the first block's rows together. It also makes sense that you'll need nrows and ncols to be part of the shape. -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.

You should be able to break your array into "blocks" using some combination of reshape and swapaxes:

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))

turns c

np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)

[out]:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

into

print(blockshaped(c, 2, 3))

[out]:
[[[ 0  1  2]
  [ 6  7  8]]

 [[ 3  4  5]
  [ 9 10 11]]

 [[12 13 14]
  [18 19 20]]

 [[15 16 17]
  [21 22 23]]]
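
To see why the reshape and swapaxes steps in blockshaped produce this layout, it may help to look at the intermediate arrays. The following is only an illustrative sketch that repeats the same three steps on the same example array; the names step1, step2 and blocks are made up for this illustration.

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 3
h, w = c.shape

# axes after the first reshape:
# (block row, row inside block, block column, column inside block)
step1 = c.reshape(h // nrows, nrows, -1, ncols)
print(step1.shape)   # (2, 2, 2, 3)

# move both "block" axes to the front:
# (block row, block column, row inside block, column inside block)
step2 = step1.swapaxes(1, 2)
print(step2.shape)   # (2, 2, 2, 3)

# merge the two block axes into a single running block index
blocks = step2.reshape(-1, nrows, ncols)
print(blocks.shape)  # (4, 2, 3)
```

After the swap, step2[i, j] is already the block in block row i and block column j; the final reshape only flattens those two indices into one.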

I've posted an inverse function, unblockshaped, here, and an N-dimensional generalization here. The generalization gives a little more insight into the reasoning behind this algorithm.
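
Since the linked posts are not reproduced on this page, here is a rough sketch of what such an inverse could look like. It simply runs the reshape and swapaxes steps backwards; the name unblockshaped_sketch and its (blocks, h, w) signature are assumptions for illustration and may differ from the linked unblockshaped.

```python
import numpy as np

def unblockshaped_sketch(blocks, h, w):
    """Reassemble blocks of shape (n, nrows, ncols) into an (h, w) array.

    Hypothetical helper: assumes the blocks are in the row-major block
    order produced by the reshape/swapaxes approach above.
    """
    n, nrows, ncols = blocks.shape
    return (blocks.reshape(h // nrows, -1, nrows, ncols)
                  .swapaxes(1, 2)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
# same steps as blockshaped(c, 2, 3), written out inline
blocks = c.reshape(2, 2, 2, 3).swapaxes(1, 2).reshape(-1, 2, 3)

restored = unblockshaped_sketch(blocks, 4, 6)
print(np.array_equal(restored, c))
```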


Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes) but it has the advantage of (1) always returning a view and (2) being capable of handling arrays of any dimension.
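
To give a feel for the "more axes, but a view" idea, here is a small sketch. It is not superbatfish's blockwise_view; it only stops the reshape/swapaxes approach before the final flattening, and it assumes a 2D input whose first reshape can be done without copying (as is the case for the contiguous example array).

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 3
h, w = c.shape

# 4D layout: grid[i, j] is the block in block row i, block column j
grid = c.reshape(h // nrows, nrows, w // ncols, ncols).swapaxes(1, 2)
print(grid.shape)                 # (2, 2, 2, 3)
print(np.shares_memory(grid, c))  # True

# writing through the 4D array changes the original array
grid[0, 1, 0, 0] = -1
print(c[0, 3])                    # -1
```

Flattening the two block axes with a further reshape, as blockshaped does, generally forces a copy, which is why the 4D form is the one that can stay a view.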

It seems to me that this is a task for numpy.split or some variant.

e.g.

a = np.arange(30).reshape([5,6])  #a.shape = (5,6)
a1 = np.split(a,3,axis=1) 
#'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
#'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)

If you have an NxN image you can create, e.g., a list of 2 Nx(N/2) subimages, and then divide them along the other axis.

numpy.hsplit and numpy.vsplit are also available.
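
As a small sketch of the two-step split described above (first along one axis, then along the other), assuming a 4x4 stand-in for the image; the names img, halves and quads are made up for this example:

```python
import numpy as np

img = np.arange(16).reshape(4, 4)

# step 1: split into left and right halves, each of shape (4, 2)
halves = np.hsplit(img, 2)

# step 2: split each half into top and bottom, giving four (2, 2) blocks
quads = [q for half in halves for q in np.vsplit(half, 2)]

for q in quads:
    print(q)
```

Note the order of the result: because the horizontal split comes first, the list runs top-left, bottom-left, top-right, bottom-right.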

There are some other answers that seem well-suited for your specific case already, but your question piqued my interest in the possibility of a memory-efficient solution usable up to the maximum number of dimensions that numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple; it's just that I still haven't used most of the really fancy features that numpy supports, so most of the time was spent researching what numpy had available and how much it could do so that I didn't have to do it.)

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
array(_like); bpa is an array_like consisting of the number of blocks per axis
(minimum of 1, must be a divisor of the corresponding axis size of array). As
the blocks are selected using normal numpy slicing, they will be views rather
than copies; this is good for very large multidimensional arrays that are being
blocked, and for very large blocks, but it also means that the result must be
copied if it is to be modified (unless modifying the original data as well is
intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray

    # parameter checking
    if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers
        or (bpa < 1).any()             # all values in bpa must be >= 1
        or (array.shape % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))


    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))

    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]

    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))

        yield array[blockbounds]
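
For comparison, the same idea (yielding blocks as views via plain slicing) can be sketched more compactly with itertools.product. This is a simplified variant without the parameter checking above, and iter_blocks is a hypothetical name, not part of numpy.

```python
import itertools
import numpy as np

def iter_blocks(array, bpa):
    """Yield equally sized blocks of `array` as views.

    bpa: number of blocks per axis; assumed to divide each axis size evenly.
    """
    sizes = [dim // n for dim, n in zip(array.shape, bpa)]
    for idx in itertools.product(*(range(n) for n in bpa)):
        yield array[tuple(slice(i * s, (i + 1) * s)
                          for i, s in zip(idx, sizes))]

a = np.arange(24).reshape(4, 6)
blocks = list(iter_blocks(a, (2, 2)))
print(len(blocks), blocks[0].shape)   # 4 (2, 3)

# the blocks are views, so writing to one changes the original array
blocks[0][0, 0] = -1
print(a[0, 0])                        # -1
```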

Your question is practically the same as this one. You can use the one-liner with np.ndindex() and reshape():

def cutter(a, r, c):
    lenr = a.shape[0]//r   # number of blocks along the rows (integer division)
    lenc = a.shape[1]//c   # number of blocks along the columns
    return np.array([a[i*r:(i+1)*r,j*c:(j+1)*c] for (i,j) in np.ndindex(lenr,lenc)]).reshape(lenr,lenc,r,c)

To create the result you want:要创建您想要的结果:

a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
#       [5, 6, 7, 8]])

cutter( a, 1, 2 )
#array([[[[1, 2]],
#        [[3, 4]]],
#       [[[5, 6]],
#        [[7, 8]]]])

A minor enhancement to TheMeaningfulEngineer's answer that handles the case where the big 2d array cannot be perfectly sliced into equally sized subarrays:

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p   
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

Examples:

a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)

a->
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

out[0] ->
array([[0., 1., 2.],
       [5., 6., 7.]])

out[1]->
array([[3., 4.],
       [8., 9.]])

out[-1]->
array([[23., 24.]])

For now it only works when the big 2d array can be perfectly sliced into equally sized subarrays.

The code below slices

a ->array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])

into this

block_array->
    array([[[ 0,  1,  2],
            [ 6,  7,  8]],

           [[ 3,  4,  5],
            [ 9, 10, 11]],

           [[12, 13, 14],
            [18, 19, 20]],

           [[15, 16, 17],
            [21, 22, 23]]])

p and q determine the block size

Code

import numpy as np

a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size

p = 2     #block row size
q = 3     #block column size

blocks_per_row = m // p      #number of blocks along the row axis (p must divide m)
blocks_per_column = n // q   #number of blocks along the column axis (q must divide n)

block_array = []
previous_row = 0
for row_block in range(blocks_per_row):
    previous_row = row_block * p   
    previous_column = 0
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)

block_array = np.array(block_array)

If you want a solution that also handles the cases when the matrix is not equally divided, you can use this:

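from functools import reduce
from operator import add
import numpy as np

arr = np.arange(25).reshape(5, 5)  # example input; substitute your own 2D array
half_split = np.array_split(arr, 2)

res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)  # flat list of the four blocks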
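
The same np.array_split idea can be sketched for an arbitrary grid rather than a fixed 2x2 split. The helper name split_grid and its parameters are made up for this illustration.

```python
import numpy as np

def split_grid(arr, n_row_blocks, n_col_blocks):
    """Split a 2D array into a flat list of blocks, row band by row band.

    Hypothetical helper: np.array_split allows uneven splits, so the
    blocks may differ in shape when the array does not divide evenly.
    """
    return [block
            for band in np.array_split(arr, n_row_blocks, axis=0)
            for block in np.array_split(band, n_col_blocks, axis=1)]

blocks = split_grid(np.arange(25).reshape(5, 5), 2, 2)
print([b.shape for b in blocks])  # [(3, 3), (3, 2), (2, 3), (2, 2)]
```

With np.array_split the leading sections get the extra row or column, so the larger blocks end up at the top and left.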

Here is a solution based on unutbu's answer that handles the case where the matrix cannot be equally divided. In this case, it first resizes the matrix using some interpolation. You need OpenCV for this. Note that cv2.resize takes its target size as (width, height), the reverse of numpy's (rows, cols) order, so rows and columns are easy to mix up here.

import numpy as np
import cv2

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of block rows
    c_nbrs   number of block columns
    """

    arr_h, arr_w = arr.shape

    # largest size that divides evenly into the requested grid
    size_w = (arr_w // c_nbrs) * c_nbrs
    size_h = (arr_h // r_nbrs) * r_nbrs

    if size_w != arr_w or size_h != arr_h:
        # cv2.resize expects the target size as (width, height)
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)

    block_h = size_h // r_nbrs   # rows per block
    block_w = size_w // c_nbrs   # columns per block

    return (arr.reshape(r_nbrs, block_h, -1, block_w)
               .swapaxes(1,2)
               .reshape(-1, block_h, block_w))

a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)

yields

[[7 6 2 4 4 2 5 2 3]
 [2 3 7 6 8 8 2 6 2]
 [4 1 3 1 3 8 1 3 7]
 [6 1 1 5 7 2 1 5 8]
 [8 8 7 6 6 1 8 8 4]
 [6 1 8 2 1 4 5 1 8]
 [7 3 4 2 5 6 1 2 7]
 [4 6 7 5 8 2 8 2 8]
 [6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
       [2, 3, 7],
       [4, 1, 3]]), array([[4, 4, 2],
       [6, 8, 8],
       [1, 3, 8]]), array([[5, 2, 3],
       [2, 6, 2],
       [1, 3, 7]])], [array([[6, 1, 1],
       [8, 8, 7],
       [6, 1, 8]]), array([[5, 7, 2],
       [6, 6, 1],
       [2, 1, 4]]), array([[1, 5, 8],
       [8, 8, 4],
       [5, 1, 8]])], [array([[7, 3, 4],
       [4, 6, 7],
       [6, 6, 5]]), array([[2, 5, 6],
       [5, 8, 2],
       [5, 6, 1]]), array([[1, 2, 7],
       [8, 2, 8],
       [2, 6, 4]])]]

Here is my solution. Note that this code doesn't actually create copies of the original array, so it works well with big data. Moreover, it doesn't crash if the array cannot be divided evenly (but you can easily add a check for that by removing ceil and verifying that v_slices and h_slices divide without remainder).

import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)

p, q = 2, 2
rows, cols = a.shape  # numpy shape order is (rows, columns)

v_slices = ceil(rows / p)  # number of block rows
h_slices = ceil(cols / q)  # number of block columns

for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block

This code changes (or, more precisely, gives you direct access to part of an array) this:

[[0 1 2]
 [3 4 5]
 [6 7 8]]

Into this:

[[0 1]
 [3 4]]
[[2]
 [5]]
[[6 7]]
[[8]]

If you need actual copies, Aenaon's code is what you are looking for.

If you are sure that the big array can be divided evenly, you can use numpy splitting tools.

To add to @Aenaon's answer and his blockfy function: if you are working with color images (3D arrays), here is my pipeline to create crops of 224 x 224 for 3-channel input.

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p   
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

then extended the above to

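import os
import numpy as np
from skimage import io   ### assumed source of io.imread (as_gray is a skimage parameter)
from PIL import Image

### path_to_crop and path_save_crop are folder paths you define yourself
for file in os.listdir(path_to_crop):   ### list files in your folder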
   img = io.imread(path_to_crop + file, as_gray=False) ### open image 

   r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
   g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
   b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel

   for x in range(0,len(r)):
       img = np.array((r[x],g[x],b[x])) ### combine each channel into one patch by patch

       img = img.astype(np.uint8) ### cast back to proper integers

       img_swap = img.swapaxes(0, 2) ### need to swap axes due to the way things were processed
       
       img_swap_2 = img_swap.swapaxes(0, 1) ### do it again

       Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                        format = 'jpeg',
                                        subsampling=0,
                                        quality=100) ### save patch with new name etc 
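
As a possible simplification, the per-channel split and re-stacking can probably be avoided: slicing only the first two axes of an (H, W, C) image leaves the channel axis intact. The sketch below assumes that layout; crop_patches is a hypothetical helper and the zero-filled array just stands in for a loaded image.

```python
import numpy as np

def crop_patches(img, p, q):
    """Return a list of p-by-q patches (views) of an (H, W, C) image.

    Hypothetical helper: patches at the bottom and right edges are
    smaller when H or W is not a multiple of p or q.
    """
    rows, cols = img.shape[:2]
    return [img[r:r + p, c:c + q]
            for r in range(0, rows, p)
            for c in range(0, cols, q)]

img = np.zeros((500, 300, 3), dtype=np.uint8)  # stand-in for a loaded image
patches = crop_patches(img, 224, 224)
print(len(patches))        # 6
print(patches[0].shape)    # (224, 224, 3)
print(patches[-1].shape)   # (52, 76, 3)
```

Because no NaN padding is involved, the patches keep the original uint8 dtype and no cast back is needed before saving.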
