Slice 2d array into smaller 2d arrays
Is there a way to slice a 2d array in numpy into smaller 2d arrays?
Example
[[1,2,3,4],   ->    [[1,2] [3,4]
 [5,6,7,8]]          [5,6] [7,8]]
So I basically want to cut down a 2x4 array into two 2x2 arrays. I am looking for a generic solution to be used on images.
There was another question a couple of months ago which clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense since this keeps the first block's rows together. It also makes sense that you'll need nrows and ncols to be part of the shape. -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.
You should be able to break your array into "blocks" using some combination of reshape and swapaxes:
import numpy as np

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size
    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))
turns c
np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)
[out]:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
into
print(blockshaped(c, 2, 3))
[out]:
[[[ 0 1 2]
[ 6 7 8]]
[[ 3 4 5]
[ 9 10 11]]
[[12 13 14]
[18 19 20]]
[[15 16 17]
[21 22 23]]]
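To make the reshape / swapaxes / reshape chain less mysterious, it can help to print the intermediate shapes. The following is a small trace on the same 4x6 array, here with 2x2 blocks so that the shape changes at every step; the names step1 and step2 are only for illustration.

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 2
h, w = c.shape

# Step 1: split each axis into (block index, offset inside the block).
# Axes are now (block_row, row_in_block, block_col, col_in_block).
step1 = c.reshape(h // nrows, nrows, -1, ncols)
print(step1.shape)  # (2, 2, 3, 2)

# Step 2: move the two block-index axes next to each other.
# Axes are now (block_row, block_col, row_in_block, col_in_block).
step2 = step1.swapaxes(1, 2)
print(step2.shape)  # (2, 3, 2, 2)

# Step 3: merge the two block-index axes into one axis of n blocks.
blocks = step2.reshape(-1, nrows, ncols)
print(blocks.shape)  # (6, 2, 2)
```

Without the swapaxes step, the final reshape would group elements in row-major order of the original array and the blocks would not keep their 2D layout.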
I've posted an inverse function, unblockshaped, here, and an N-dimensional generalization here. The generalization gives a little more insight into the reasoning behind this algorithm.
Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes) but it has the advantage of (1) always returning a view and (2) being capable of handling arrays of any dimension.
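The inverse function mentioned above is not reproduced on this page. As a rough sketch only, and not necessarily the author's own code, an inverse could run the same steps backwards; the name unblockshaped_sketch and its (h, w) parameters are assumptions made for this example.

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    """Cut a 2D array into (n, nrows, ncols) blocks (same chain as above)."""
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

def unblockshaped_sketch(blocks, h, w):
    """Hypothetical inverse: reassemble (n, nrows, ncols) blocks into (h, w)."""
    n, nrows, ncols = blocks.shape
    return (blocks.reshape(h // nrows, -1, nrows, ncols)  # (block_row, block_col, r, c)
                  .swapaxes(1, 2)                          # (block_row, r, block_col, c)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
restored = unblockshaped_sketch(blockshaped(c, 2, 3), 4, 6)
print(np.array_equal(restored, c))  # True
```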
It seems to me that this is a task for numpy.split or some variant.
e.g.
import numpy as np
a = np.arange(30).reshape([5,6]) # a.shape = (5,6)
a1 = np.split(a, 3, axis=1)
# 'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
# 'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)
If you have an NxN image you can create, e.g., a list of 2 NxN/2 subimages, and then divide them along the other axis.
numpy.hsplit and numpy.vsplit are also available.
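As a small illustration of the two-pass idea described above, the sketch below cuts a square array into four tiles by splitting along the columns first and then along the rows. The 6x6 array is only a stand-in for an image, and np.split requires the sizes to divide evenly.

```python
import numpy as np

img = np.arange(36).reshape(6, 6)   # stand-in for an N x N image

# First pass: two N x N/2 halves (split along the columns, axis=1).
halves = np.split(img, 2, axis=1)

# Second pass: split each half along the rows (axis=0).
tiles = [tile for half in halves for tile in np.split(half, 2, axis=0)]

print([t.shape for t in tiles])  # [(3, 3), (3, 3), (3, 3), (3, 3)]
```

The tiles come out in the order top-left, bottom-left, top-right, bottom-right, because the column split is done first.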
There are some other answers that already seem well-suited for your specific case, but your question piqued my interest in the possibility of a memory-efficient solution usable up to the maximum number of dimensions that numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple. It's just that I still haven't used most of the really fancy features that numpy supports, so most of the time was spent researching what numpy had available and how much it could do so that I didn't have to do it.)
import numpy as np

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
    array(_like); bpa is an array_like consisting of the number of blocks per axis
    (minimum of 1, must be a divisor of the corresponding axis size of array). As
    the blocks are selected using normal numpy slicing, they will be views rather
    than copies; this is good for very large multidimensional arrays that are being
    blocked, and for very large blocks, but it also means that the result must be
    copied if it is to be modified (unless modifying the original data as well is
    intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray
    # parameter checking
    if array.ndim != bpa.size: # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    # np.issubdtype is used because the np.int alias no longer exists in current NumPy
    if (not np.issubdtype(bpa.dtype, np.integer) # bpa must be all integers
            or (bpa < 1).any() # all values in bpa must be >= 1
            or (array.shape % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))
    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))
    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]
    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))
        yield array[blockbounds]
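For readers who want to try the idea quickly, here is a condensed sketch of the same approach (per-axis slice lists combined into one index per block). It is not the function above: the name iter_blocks and the use of itertools.product are assumptions made for this example.

```python
import itertools
import numpy as np

def iter_blocks(array, blocks_per_axis):
    """Yield views of `array`, cut into blocks_per_axis[i] pieces along axis i."""
    if len(blocks_per_axis) != array.ndim:
        raise ValueError("need one block count per axis")
    per_axis_slices = []
    for size, n_blocks in zip(array.shape, blocks_per_axis):
        if n_blocks < 1 or size % n_blocks:
            raise ValueError("block counts must evenly divide the axis sizes")
        step = size // n_blocks
        per_axis_slices.append([slice(k, k + step) for k in range(0, size, step)])
    # The Cartesian product of the per-axis slices enumerates every block.
    for index in itertools.product(*per_axis_slices):
        yield array[index]

vol = np.arange(4 * 6 * 2).reshape(4, 6, 2)
blocks = list(iter_blocks(vol, (2, 3, 1)))
print(len(blocks), blocks[0].shape)      # 6 (2, 2, 2)
print(np.shares_memory(blocks[0], vol))  # True: blocks are views, not copies
```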
Your question is practically the same as this one. You can use the one-liner with np.ndindex() and reshape():
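def cutter(a, r, c):
    lenr = a.shape[0] // r  # number of blocks along the rows (integer division)
    lenc = a.shape[1] // c  # number of blocks along the columns
    return np.array([a[i*r:(i+1)*r, j*c:(j+1)*c] for (i, j) in np.ndindex(lenr, lenc)]).reshape(lenr, lenc, r, c)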
To create the result you want:
a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
# [5, 6, 7, 8]])
cutter( a, 1, 2 )
#array([[[[1, 2]],
# [[3, 4]]],
# [[[5, 6]],
# [[7, 8]]]])
A minor enhancement to TheMeaningfulEngineer's answer that handles the case where the big 2d array cannot be perfectly sliced into equally sized subarrays:
def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0] #image row size
    n = a.shape[1] #image column size
    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of blocks along the row axis
    bpc = ((n-1)//q + 1) #number of blocks along the column axis
    M = p * bpr
    N = q * bpc
    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a
    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]
            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]
            ## append
            if block.size:
                block_list.append(block)
    return block_list
Examples:
a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)
a->
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
out[0] ->
array([[0., 1., 2.],
[5., 6., 7.]])
out[1]->
array([[3., 4.],
[8., 9.]])
out[-1]->
array([[23., 24.]])
For now it only works when the big 2d array can be perfectly sliced into equally sized subarrays.
The code below slices
a ->array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
into this
block_array->
array([[[ 0, 1, 2],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 9, 10, 11]],
[[12, 13, 14],
[18, 19, 20]],
[[15, 16, 17],
[21, 22, 23]]])
p and q determine the block size
Code
import numpy as np
a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0] #image row size
n = a.shape[1] #image column size
p = 2 #block row size
q = 3 #block column size
blocks_per_row = m // p #number of blocks along the row axis
blocks_per_column = n // q #number of blocks along the column axis
block_array = []
previous_row = 0
for row_block in range(blocks_per_row):
    previous_row = row_block * p
    previous_column = 0
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)
block_array = np.array(block_array)
If you want a solution that also handles the cases where the matrix cannot be divided equally, you can use this:
import numpy as np
from functools import reduce
from operator import add
# arr is the 2D array to split
half_split = np.array_split(arr, 2)
res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)
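To see what np.array_split does when the sizes do not divide evenly, here is a short self-contained run on a 5x5 array; the array and the variable names are made up for the example.

```python
from functools import reduce
from operator import add
import numpy as np

arr = np.arange(25).reshape(5, 5)      # 5 is not divisible by 2
half_split = np.array_split(arr, 2)    # row groups of 3 and 2 rows
# Split each row group into two column groups and flatten into one list.
quarters = reduce(add, (np.array_split(x, 2, axis=1) for x in half_split))

print([q.shape for q in quarters])  # [(3, 3), (3, 2), (2, 3), (2, 2)]
```

The leading pieces receive the extra row or column, so the four blocks cover the whole array without padding or resizing.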
Here is a solution based on unutbu's answer that handles the case where the matrix cannot be equally divided. In this case, it first resizes the matrix using some interpolation. You need OpenCV for this. Note that I had to swap ncols and nrows to make it work; I didn't figure out why.
import numpy as np
import cv2

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr a 2D array, typically an image
    r_nbrs numbers of rows
    c_nbrs numbers of cols
    """
    arr_h, arr_w = arr.shape
    # largest width/height that c_nbrs/r_nbrs divide evenly
    size_w = (arr_w // c_nbrs) * c_nbrs
    size_h = (arr_h // r_nbrs) * r_nbrs
    if size_w != arr_w or size_h != arr_h:
        # cv2.resize takes the target size as (width, height)
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)
    # nrows ends up holding the block width and ncols the block height;
    # this gives the intended blocks only when r_nbrs == c_nbrs
    nrows = size_w // r_nbrs
    ncols = size_h // c_nbrs
    return (arr.reshape(r_nbrs, ncols, -1, nrows)
               .swapaxes(1,2)
               .reshape(-1, ncols, nrows))
a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)
yields
[[7 6 2 4 4 2 5 2 3]
[2 3 7 6 8 8 2 6 2]
[4 1 3 1 3 8 1 3 7]
[6 1 1 5 7 2 1 5 8]
[8 8 7 6 6 1 8 8 4]
[6 1 8 2 1 4 5 1 8]
[7 3 4 2 5 6 1 2 7]
[4 6 7 5 8 2 8 2 8]
[6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
[2, 3, 7],
[4, 1, 3]]), array([[4, 4, 2],
[6, 8, 8],
[1, 3, 8]]), array([[5, 2, 3],
[2, 6, 2],
[1, 3, 7]])], [array([[6, 1, 1],
[8, 8, 7],
[6, 1, 8]]), array([[5, 7, 2],
[6, 6, 1],
[2, 1, 4]]), array([[1, 5, 8],
[8, 8, 4],
[5, 1, 8]])], [array([[7, 3, 4],
[4, 6, 7],
[6, 6, 5]]), array([[2, 5, 6],
[5, 8, 2],
[5, 6, 1]]), array([[1, 2, 7],
[8, 2, 8],
[2, 6, 4]])]]
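The nested list produced by vsplit and hsplit can be put back together, or turned into a single 4D array, with standard NumPy calls. The sketch below uses its own random 9x9 array (seeded through default_rng, which is a choice made for this example) rather than the one printed above.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(1, 9, size=(9, 9))
out = [np.hsplit(x, 3) for x in np.vsplit(a, 3)]   # 3 x 3 grid of 3 x 3 tiles

rebuilt = np.block(out)               # nested list of tiles -> one 2D array
print(np.array_equal(rebuilt, a))     # True

grid = np.array(out)                  # possible because all tiles share a shape
print(grid.shape)                     # (3, 3, 3, 3)
```

In the 4D form, grid[i, j] is the tile in block-row i and block-column j.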
Here is my solution. Notice that this code doesn't actually create copies of the original array, so it works well with big data. Moreover, it doesn't crash if the array cannot be divided evenly (but you can easily add a condition for that by deleting ceil and checking whether v_slices and h_slices divide without remainder).
import numpy as np
from math import ceil
a = np.arange(9).reshape(3, 3)
p, q = 2, 2
height, width = a.shape # shape is (rows, columns)
v_slices = ceil(height / p) # number of block rows
h_slices = ceil(width / q) # number of block columns
for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block
This code changes (or, more precisely, gives you direct access to parts of) this array:
[[0 1 2]
[3 4 5]
[6 7 8]]
Into this:
[[0 1]
[3 4]]
[[2]
[5]]
[[6 7]]
[[8]]
If you need actual copies, Aenaon's code is what you are looking for.
If you are sure that the big array can be divided evenly, you can use the numpy splitting tools.
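The difference between views and copies can be checked directly. This is a minimal sketch using the same 3x3 array and 2x2 block size; np.shares_memory and .copy() are standard NumPy, while the list comprehension is just a compact way to collect the slices.

```python
import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)
p, q = 2, 2
blocks = [a[i * p:(i + 1) * p, j * q:(j + 1) * q]
          for i in range(ceil(a.shape[0] / p))
          for j in range(ceil(a.shape[1] / q))]

print(np.shares_memory(blocks[0], a))  # True
blocks[0][0, 0] = 100                  # writing through the view...
print(a[0, 0])                         # 100, the original changed too

safe = blocks[1].copy()                # an explicit copy is independent
safe[:] = -1
print(a[0, 2])                         # 2, the original is untouched
```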
To add to @Aenaon's answer and his blockfy function: if you are working with color images / 3D arrays, here is my pipeline to create crops of 224 x 224 for 3-channel input.
def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0] #image row size
    n = a.shape[1] #image column size
    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of blocks along the row axis
    bpc = ((n-1)//q + 1) #number of blocks along the column axis
    M = p * bpr
    N = q * bpc
    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a
    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]
            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]
            ## append
            if block.size:
                block_list.append(block)
    return block_list
Then I extended the above to:
import os
import numpy as np
from skimage import io # assumed: io.imread(..., as_gray=...) matches scikit-image
from PIL import Image # assumed: Image.fromarray matches Pillow
# path_to_crop and path_save_crop are your input and output folders
for file in os.listdir(path_to_crop): ### list files in your folder
    img = io.imread(path_to_crop + file, as_gray=False) ### open image
    r = blockfy(img[:,:,0],224,224) ### crop blocks of 224 x 224 for red channel
    g = blockfy(img[:,:,1],224,224) ### crop blocks of 224 x 224 for green channel
    b = blockfy(img[:,:,2],224,224) ### crop blocks of 224 x 224 for blue channel
    for x in range(0,len(r)):
        img = np.array((r[x],g[x],b[x])) ### combine the channels, patch by patch
        img = img.astype(np.uint8) ### cast back to proper integers
        img_swap = img.swapaxes(0, 2) ### need to swap axes due to the way things were processed
        img_swap_2 = img_swap.swapaxes(0, 1) ### do it again
        Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                         format = 'jpeg',
                                         subsampling=0,
                                         quality=100) ### save patch with new name etc
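As a possible simplification, and not part of the pipeline above, a color image can also be tiled by slicing only the first two axes, which leaves the channel axis and the dtype untouched. The helper name tile_image and the synthetic 500x300 image are assumptions made for this sketch.

```python
import numpy as np

def tile_image(img, p, q):
    """Return a list of p-by-q crops of an (H, W, C) image; edge crops may be smaller."""
    h, w = img.shape[:2]
    return [img[r:r + p, c:c + q]      # the channel axis is left untouched
            for r in range(0, h, p)
            for c in range(0, w, q)]

# Synthetic stand-in for a loaded RGB image (500 rows, 300 columns, uint8).
img = np.zeros((500, 300, 3), dtype=np.uint8)
crops = tile_image(img, 224, 224)

print(len(crops))                        # 6
print(crops[0].shape, crops[-1].shape)   # (224, 224, 3) (52, 76, 3)
```

Because slicing past the end of an axis simply truncates, no NaN padding, per-channel handling, or axis swapping is needed in this variant; crops on the right and bottom edges are smaller than 224 x 224.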