numpy.reshape breaks when dealing with an array of a big size

I have an initial NumPy array of shape (512,512,100) and dtype np.float64, and I use the view_as_windows function to get an array of shape (449,449,100,64,64). This function returns a view that consumes much less memory than an actual NumPy array. I want to reshape the view to (449*449*100,64,64). With the regular np.reshape function, the reshape takes a very long time and then fails, because it tries to convert the view into an actual NumPy array. I tried the following, and none of them works:

# gs.shape gives (449, 449, 100, 64, 64)
# Attempt 1: assign the new shape in place
gs.shape = (gs.shape[0]*gs.shape[1]*gs.shape[2], 64, 64)
# Attempt 2: np.reshape
gs = np.reshape(gs, (gs.shape[0]*gs.shape[1]*gs.shape[2], 64, 64))
# Attempt 3: ndarray.reshape
gs = gs.reshape(gs.shape[0]*gs.shape[1]*gs.shape[2], 64, 64)

What is the correct way to change the view without actually causing memory overload?

Your starting array (with a small dtype):

In [100]: arr = np.ones((512,512,100), 'int8')    
In [101]: arr.nbytes, arr.strides
Out[101]: (26214400, (51200, 100, 1))

Using sliding_window_view, the version of view_as_windows ported to the NumPy core:

In [102]: new = np.lib.stride_tricks.sliding_window_view(arr,(64,64),(0,1))

In [103]: new.shape, new.strides
Out[103]: ((449, 449, 100, 64, 64), 
          (51200, 100, 1, 51200, 100))

In [105]: new.nbytes/1e9        # GB a full copy would need (int8: 1 byte per element)
Out[105]: 82.5757696

Note that the final dimension of arr, 100, is now the 3rd axis; in the view it still has a single-element stride.

Effectively, the first dimension of size 512 has been 'split' between axes 0 and 3, and the second dimension of size 512 between axes 1 and 4.
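To make the axis 'split' concrete, here is a minimal sketch on a much smaller array. The 8x8x3 array, the 4x4 window, and the names small and win are assumptions chosen for illustration; they are not from the original post. It checks that element [i, j, k, a, b] of the view is element [i+a, j+b, k] of the source, and that the window axes reuse the strides of the axes they came from.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Small stand-in for the (512, 512, 100) array; int16 so strides are easy to read.
small = np.arange(8 * 8 * 3, dtype=np.int16).reshape(8, 8, 3)

# 4x4 windows over axes 0 and 1; the window axes are appended at the end.
win = sliding_window_view(small, (4, 4), (0, 1))

# Axes 0 and 3 both walk along source axis 0; axes 1 and 4 along source axis 1.
i, j, k, a, b = 2, 1, 0, 3, 2
value_matches = win[i, j, k, a, b] == small[i + a, j + b, k]

# One whole window is just a slice of the source.
window_matches = np.array_equal(win[i, j, k], small[i:i + 4, j:j + 4, k])

print(small.strides)  # strides of the source
print(win.strides)    # the last two strides repeat the first two
```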

To get the shape (449*449*100, 64, 64), the strides would have to be (?, 51200, 100). I'm not an expert on as_strided, but I have enough experience to say there's no way of reworking the first 3 strides (51200, 100, 1) into one.
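The stride argument can be checked numerically. Two adjacent axes can be fused into one without copying only when the stride of the first equals the length of the second times its stride. The helper can_merge below is a hypothetical name written for this sketch, not a NumPy function, and the small 8x8x3 array is an assumption to keep memory tiny. In the full-size case the test fails for axes 0 and 1 (51200 versus 449*100 = 44900), so the three leading axes cannot collapse into one stride.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def can_merge(shape, strides, axis):
    """True if `axis` and `axis + 1` can be fused into one axis without a copy.

    Hypothetical helper for this sketch; it ignores size-1 axes for simplicity.
    """
    return strides[axis] == shape[axis + 1] * strides[axis + 1]

small = np.ones((8, 8, 3), dtype=np.int8)
win = sliding_window_view(small, (4, 4), (0, 1))  # shape (5, 5, 3, 4, 4)

merge_01 = can_merge(win.shape, win.strides, 0)  # 24 vs 5 * 3
merge_12 = can_merge(win.shape, win.strides, 1)  # 3 vs 3 * 1

# Fusing only axes 1 and 2 can stay a view; fusing all three forces a copy.
partial = win.reshape(5, 15, 4, 4)
full = win.reshape(-1, 4, 4)
partial_is_view = np.shares_memory(partial, small)
full_is_view = np.shares_memory(full, small)
```

On this small case, only the fully flattened reshape stops sharing memory with the source, which is the behaviour that exhausts memory at full size.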

Let's remove the size 100 dimension to simplify things a bit.

In [106]: arr = np.ones((512,512), 'int8')
In [107]: arr.strides
Out[107]: (512, 1)

The windowed view:

In [108]: new = np.lib.stride_tricks.sliding_window_view(arr,(64,64),(0,1))
In [109]: new.shape, new.strides
Out[109]: ((449, 449, 64, 64), (512, 1, 512, 1))

and its 3d reshape (a copy):

In [110]: new1 = new.reshape(-1,64,64)    
In [112]: new1.shape, new1.strides
Out[112]: ((201601, 64, 64), (4096, 64, 1))
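The contiguous strides (4096, 64, 1) show that the reshape produced a fresh copy. If the goal is only to address windows by a single flat index, one possible workaround (not part of the original answer) is to leave the view as it is and translate the flat index with np.unravel_index. The sketch below uses a small 20x20 array, and the names window_at and iter_batches are hypothetical helpers introduced for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_at(view, n):
    """Return window number `n` (row-major order) as a view, without copying."""
    idx = np.unravel_index(n, view.shape[:-2])
    return view[idx]

def iter_batches(view, batch_size):
    """Yield copies of shape (<= batch_size, h, w), one small batch at a time."""
    lead = view.shape[:-2]
    total = int(np.prod(lead))
    for start in range(0, total, batch_size):
        flat = np.arange(start, min(start + batch_size, total))
        # Fancy indexing copies, but only batch_size windows at once.
        yield view[np.unravel_index(flat, lead)]

arr = np.arange(20 * 20, dtype=np.int32).reshape(20, 20)
view = sliding_window_view(arr, (8, 8))   # shape (13, 13, 8, 8)

reference = view.reshape(-1, 8, 8)        # full copy; affordable only because arr is small
single = window_at(view, 30)
batched = np.concatenate(list(iter_batches(view, 50)))
```

Each batch is a real copy, so peak memory is set by batch_size rather than by the total number of windows.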
