
Converting a 2D array into a 3D array in Python

Sorry if this has been asked already, but in my case I have a matrix of size 3000000x50 that I want to split into 300 matrices of size 10000x50. I tried this, but it is not working:

>>> import numpy as np
>>> data = np.random.randn(3000000, 50)
>>> D = np.matrix.conjugate(data)
>>> ts = 50
>>> ts = int(ts)       # number of time series that we have from our data
>>> lw = 1e4
>>> lw = int(lw)       # length of each window
>>> l = len(data) / lw   # l is the number of windows
>>> l = np.floor(l)
>>> l = int(l)
>>> # Dc is used to separate each time series into l windows
>>> Dc = np.zeros((l, lw, ts))
>>> for i in range(l):
...     Dc[i][0:lw-1][0:ts-1] = D[(lw)*(i):(lw*(i+1))-1][0:ts-1]
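
One likely reason the loop above does not fill the windows as intended: in NumPy, chained indexing such as `a[0:m][0:n]` slices the first axis twice instead of slicing rows and then columns, and the end of a slice is already exclusive, so the extra `-1` drops one more element. The following is a minimal sketch with toy sizes; the names `a`, `chained`, `two_axis`, and `windows` are illustrative and not part of the original post.

```python
import numpy as np

a = np.arange(24).reshape(6, 4)   # toy 6x4 array

# Chained slices both act on axis 0 (rows):
chained = a[0:5][0:3]             # first 5 rows, then the first 3 of those rows
# A single index with a comma slices rows and columns:
two_axis = a[0:5, 0:3]

print(chained.shape)   # (3, 4)
print(two_axis.shape)  # (5, 3)

# Slice ends are exclusive, so no "-1" is needed to fill a whole window.
lw, ts = 2, 4
windows = np.zeros((3, lw, ts))
for i in range(3):
    windows[i, :, :] = a[lw * i:lw * (i + 1), :]
```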

You are looking for np.vsplit (split an array into multiple sub-arrays vertically, i.e. row-wise):

np.vsplit(data,300)

Sample run:

In [56]: data
Out[56]: 
array([[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
       [ 0.40789713,  0.36018843,  0.41731607,  0.17348898],
       [ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
       [ 0.52407036,  0.89913995,  0.59097535,  0.38376443],
       [ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
       [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]])

In [57]: np.vsplit(data,3)
Out[57]: 
[array([[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
        [ 0.40789713,  0.36018843,  0.41731607,  0.17348898]]),
 array([[ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
        [ 0.52407036,  0.89913995,  0.59097535,  0.38376443]]),
 array([[ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
        [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]])]

Depending on how you are going to use the output, you can instead reshape the 2D input array into a 3D array of length 300 along the first axis, which is much more efficient in both runtime and memory. Memory-wise it is essentially free, because reshaping a contiguous array creates just a view of the original NumPy array. The implementation would be:

data.reshape(300,-1,data.shape[1])

Sample run:

In [68]: data
Out[68]: 
array([[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
       [ 0.40789713,  0.36018843,  0.41731607,  0.17348898],
       [ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
       [ 0.52407036,  0.89913995,  0.59097535,  0.38376443],
       [ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
       [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]])

In [69]: data.reshape(3,-1,data.shape[1])
Out[69]: 
array([[[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
        [ 0.40789713,  0.36018843,  0.41731607,  0.17348898]],

       [[ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
        [ 0.52407036,  0.89913995,  0.59097535,  0.38376443]],

       [[ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
        [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]]])
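
The claim that reshaping only creates a view can be checked with `np.shares_memory`. This is a small sketch assuming a freshly created, C-contiguous array; for non-contiguous input (for example a transposed array) `reshape` may have to copy. The name `cube` is illustrative.

```python
import numpy as np

data = np.random.rand(6, 4)
cube = data.reshape(3, -1, data.shape[1])   # shape (3, 2, 4)

print(np.shares_memory(data, cube))  # True: no data was copied
cube[0, 0, 0] = -1.0                 # writing through the view...
print(data[0, 0])                    # ...changes the original: -1.0
```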

Here are some runtime tests comparing actually splitting with reshaping:

In [72]: data = np.random.rand(6000,40)

In [73]: %timeit np.vsplit(data,300)
100 loops, best of 3: 7.05 ms per loop

In [74]: %timeit data.reshape(300,-1,data.shape[1])
1000000 loops, best of 3: 1.08 µs per loop
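
Outside IPython, a similar comparison could be sketched with the standard `timeit` module. The variable names `t_split` and `t_reshape` and the choice of 100 repetitions are assumptions; absolute numbers depend on the machine and NumPy version.

```python
import timeit
import numpy as np

data = np.random.rand(6000, 40)

# Total seconds for 100 calls each.
t_split = timeit.timeit(lambda: np.vsplit(data, 300), number=100)
t_reshape = timeit.timeit(lambda: data.reshape(300, -1, data.shape[1]), number=100)

print(f"vsplit:  {t_split / 100:.2e} s per call")
print(f"reshape: {t_reshape / 100:.2e} s per call")
```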

If your initial array is already ordered correctly and you want to split it into 300 matrix "boxes", you just need the following reshaping of the matrix:

import numpy as np
data = np.random.randn(3000000,50)
newData = data.reshape(300, 10000, 50)  # newData has shape (300, 10000, 50)

print(newData[0, ...])  # show the first matrix, 1 of 300
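
The reshape above assumes the number of rows is an exact multiple of the window length. If it is not, one option (mirroring the floor used in the question) is to drop the leftover rows first. This is a sketch with toy sizes; `lw`, `n_windows`, `trimmed`, and `new_data` are hypothetical names.

```python
import numpy as np

data = np.random.randn(1005, 50)    # toy size; 1005 is not a multiple of 100
lw = 100                            # window length (hypothetical value)

n_windows = data.shape[0] // lw     # 10 full windows
trimmed = data[:n_windows * lw]     # drop the 5 leftover rows
new_data = trimmed.reshape(n_windows, lw, data.shape[1])

# Window i holds rows i*lw ... (i+1)*lw - 1 of the original matrix.
i = 3
print(np.array_equal(new_data[i], data[i * lw:(i + 1) * lw]))  # True
```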

