
Splitting multidimensional array in Numpy

I'm trying to split a multidimensional array (array):

import numpy as np

shape = (3, 4, 4, 2)
array = np.random.randint(0,10,shape)

into an array (new_array) with shape (3,2,2,2,2,2), where dimension 1 of array has been split in two (dimensions 1 and 2) and dimension 2 of array has been split in two (dimensions 3 and 4).

So far I have one working method:

div_x = 2
div_y = 2
new_dim_x = shape[1]//div_x
new_dim_y = shape[2]//div_y

new_array_split = np.array([
    np.split(each_sub, axis=2, indices_or_sections=div_y)
    for each_sub in np.split(array[:, :(new_dim_x*div_x), :(new_dim_y*div_y)], axis=1, indices_or_sections=div_x)
])

I'm also looking into using reshape:

new_array_reshape = array[:, :(div_x*new_dim_x), :(div_y*new_dim_y), ...].reshape(shape[0], div_x, div_y, new_dim_x, new_dim_y, shape[-1]).transpose(1,2,0,3,4,5)

The reshape method is faster than the split method:

%timeit array[:, :(div_x*new_dim_x), :(div_y*new_dim_y), ...].reshape(shape[0], div_x, div_y, new_dim_x, new_dim_y, shape[-1]).transpose(1,2,0,3,4,5)
2.16 µs ± 44.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.array([np.split(each_sub, axis=2, indices_or_sections=div_y) for each_sub in np.split(array[:, :(new_dim_x*div_x), :(new_dim_y*div_y)], axis=1, indices_or_sections=div_x)])
58.3 µs ± 2.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

However, I cannot get the same results, because of the last dimension:

print('Reshape method')
print(new_array_reshape[1,0,0,...])
print('\nSplit method')
print(new_array_split[1,0,0,...])
 
Reshape method
[[[2 2]
  [4 3]]
 [[3 5]
  [5 9]]]

Split method
[[[2 2]
  [4 3]]
 [[5 3]
  [9 8]]]

The split method does exactly what I want (I checked it number by number), but not at the speed I would like.

QUESTION

Is there a way to achieve the same results as the split method, using reshape or any other approach?

CONTEXT

The array is actually flow data from image processing, where the first dimension of array is time, the second dimension is coordinate x (4), the third dimension is coordinate y (4), and the fourth dimension (2) holds the magnitude and phase of the flow.

I would like to split the images (coordinates x and y) into 2x2 subimages, so I can analyse the flow more locally, perform averages, clustering, etc.

This process (splitting) is going to be performed many times, which is why I'm looking for an optimal and efficient solution. I believe the way is probably using reshape, but I'm open to any other option.

Reshape and permute axes -

array.reshape(3,2,2,2,2,2).transpose(1,3,0,2,4,5)
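
Why this axis order works: reshape splits each axis in place, so x becomes (block index, offset within block) and y does the same, giving the layout (t, block_x, x, block_y, y, c). The transpose then moves the two block indices to the front. Below is a minimal sketch of a generalised version, checked against the split method from the question. The helper name split_blocks is hypothetical, and the cropping of leftover rows and columns simply mirrors the slicing in the question.

```python
import numpy as np

def split_blocks(arr, div_x, div_y):
    """Split axes 1 and 2 of a (t, x, y, c) array into div_x * div_y blocks.

    Returns shape (div_x, div_y, t, x // div_x, y // div_y, c), the same
    layout that the nested np.split approach produces.
    """
    t, x, y, c = arr.shape
    bx, by = x // div_x, y // div_y               # block size along x and y
    cropped = arr[:, :div_x * bx, :div_y * by]    # drop leftover rows/columns
    # reshape splits each axis in place: x -> (div_x, bx), y -> (div_y, by)
    blocks = cropped.reshape(t, div_x, bx, div_y, by, c)
    # move the two block indices to the front
    return blocks.transpose(1, 3, 0, 2, 4, 5)

rng = np.random.default_rng(0)
array = rng.integers(0, 10, size=(3, 4, 4, 2))

# the split method from the question, for comparison
via_split = np.array([
    np.split(sub, 2, axis=2) for sub in np.split(array, 2, axis=1)
])
via_reshape = split_blocks(array, 2, 2)

print(via_reshape.shape)                        # (2, 2, 3, 2, 2, 2)
print(np.array_equal(via_split, via_reshape))   # True
```

The transpose(1,2,0,3,4,5) attempted in the question moves the x offset axis to the front instead of the y block index, which is why its output differs from the split method.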

For your use case I'm not sure reshape is the best option. If you want to be able to locally average and cluster, you might want a window function:

from skimage.util import view_as_windows

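def window_over(arr, size=2, step=2, axes=(1, 2)):
    # window shape: full extent on every axis except the windowed ones
    wshp = list(arr.shape)
    for a in axes:
        wshp[a] = size
    # squeeze drops the length-1 axes left over from the non-windowed dimensions
    return view_as_windows(arr, wshp, step).squeeze()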

window_over(test).shape
Out[]: (2, 2, 3, 2, 2, 2)

Your output axes can then be rearranged however you want using transpose. The benefit of this approach is that you can also get the intermediate windows:

window_over(test, step = 1).shape
Out[]: (3, 3, 3, 2, 2, 2)

That includes the overlapping 2x2 windows, so you get a 3x3 grid of windows.

Since overlapping is possible, the dimension size also doesn't need to be divisible by the window size:

window_over(test, size = 3, step = 1).shape
Out[]: (2, 2, 3, 3, 3, 2)
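
If scikit-image is not available, a similar result can probably be had with NumPy alone. This is a sketch assuming NumPy 1.20 or newer, which provides numpy.lib.stride_tricks.sliding_window_view; it is a swapped-in alternative, not the window_over function above. Note that it appends the window axes at the end, so the layout is (t, i, j, c, wx, wy) rather than the one shown above, and the sample data and the mean are only illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# illustrative data with the same shape as in the question
array = np.arange(3 * 4 * 4 * 2).reshape(3, 4, 4, 2)

# all 2x2 windows over axes 1 and 2 (a read-only view, no copy)
windows = sliding_window_view(array, (2, 2), axis=(1, 2))
print(windows.shape)    # (3, 3, 3, 2, 2, 2)

# keeping every second window start gives the non-overlapping tiles (step = 2)
tiles = windows[:, ::2, ::2]
print(tiles.shape)      # (3, 2, 2, 2, 2, 2)

# local average over each window, taken across the two window axes
overlap_mean = windows.mean(axis=(-2, -1))   # shape (3, 3, 3, 2)
tile_mean = tiles.mean(axis=(-2, -1))        # shape (3, 2, 2, 2)

# rearrange the tiles to the (block_x, block_y, t, x, y, c) layout of the split method
tiles_split_layout = tiles.transpose(1, 2, 0, 4, 5, 3)
```

Because the windows are views into the original array, building them is cheap; memory is only allocated when a reduction such as mean is computed.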
