
Creating 3D numpy.ndarray with no fixed second dimension

Sometimes data, such as speech data, have a known number of observations (n), an unknown duration, and a known number of measurements (k).

In the 2D case in NumPy, it is clear how data with a known number of observations (n) and an unknown duration are represented: with an ndarray of shape (n, ). For example:

import numpy as np

x = np.array([ [ 1, 2 ],
               [ 1, 2, 3 ]
             ])

print(x.shape) ### Returns: (2, )

Is there an equivalent for the 3D case in NumPy, where we could have an ndarray of shape (n, , k)? The best alternative I can think of is a 2D ndarray of shape (n, ) in which each element is also 2D, with a (transposed) shape of (k, ). For example,

import numpy as np

x = np.array([ [ [1,2], [1,2] ],
               [ [1,2], [1,2], [1,2] ]
             ])

print(x.shape) ### Returns: (2, ); Desired: (2, , 2)

Ideally, a solution would be able to tell us the dimensionality properties of an ndarray without the need for a recursive call (maybe with an alternative to shape?).

You seem to have misunderstood what a shape of (2,) means. It doesn't mean (2, <unknown>); the comma is not a separator between 2 and some sort of blank dimension. (2,) is the Python syntax for a one-element tuple whose one element is 2. Python uses this syntax because (2) would mean the integer 2, not a tuple.
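
The tuple syntax can be checked directly. The following is a small sketch, not part of the original answer; the arrays a and b are made up for illustration.

```python
import numpy as np

# (2) is just the integer 2 in parentheses; (2,) is a one-element tuple.
assert isinstance((2), int)
assert isinstance((2,), tuple) and len((2,)) == 1

# A flat array of five numbers has exactly one dimension, of length 5.
a = np.arange(5)
print(a.shape)  # (5,)
print(a.ndim)   # 1

# A genuinely two-dimensional array reports both lengths.
b = np.zeros((2, 3))
print(b.shape)  # (2, 3)
print(b.ndim)   # 2
```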

You are not creating a two-dimensional array with an arbitrary-length second dimension. You are creating a one-dimensional array of object dtype. Its elements are ordinary Python lists. An array like this is incompatible with almost every useful thing in NumPy.

There is no way to create NumPy arrays with variable-length dimensions, whether in the 2D case you thought worked, or in the 3D case you're trying to make work.
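
Since every dimension must have a fixed length, one common workaround (not something the answer above proposes) is to pad each observation to the longest duration and keep the true lengths in a separate array. A minimal sketch, assuming zero is an acceptable fill value; the names observations, lengths and padded are hypothetical:

```python
import numpy as np

# Two hypothetical observations with k = 2 measurements per time step
# and different durations (2 and 3 steps).
observations = [
    np.array([[1, 2], [1, 2]]),
    np.array([[1, 2], [1, 2], [1, 2]]),
]

n = len(observations)
k = observations[0].shape[1]
lengths = np.array([obs.shape[0] for obs in observations])

# Pad every observation with zeros up to the longest duration.
padded = np.zeros((n, lengths.max(), k), dtype=observations[0].dtype)
for i, obs in enumerate(observations):
    padded[i, : obs.shape[0]] = obs

print(padded.shape)  # (2, 3, 2)
print(lengths)       # [2 3]
```

The result is an ordinary numeric array of shape (n, max duration, k), so regular NumPy operations apply, at the cost of having to ignore the padded rows using lengths.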

Just to review the 1d case: 仅审查一维案例:

In [33]: x = np.array([[1,2],[1,2,3]])                                          
In [34]: x.shape                                                                
Out[34]: (2,)
In [35]: x                                                                      
Out[35]: array([list([1, 2]), list([1, 2, 3])], dtype=object)

The result is a 2-element array of lists, whereas we started with a list of lists. Not much difference.

But note that if the lists are the same size, np.array creates a numeric 2d array:

In [36]: x = np.array([[1,2,4],[1,2,3]])                                        
In [37]: x                                                                      
Out[37]: 
array([[1, 2, 4],
       [1, 2, 3]])

So don't count on the behavior we see in [33]: whether np.array returns an object array of lists or a numeric 2d array depends on the lengths of the input lists.
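
One way to get the same result regardless of the list lengths is to allocate the object array first and assign the elements one by one. This is a sketch; as_object_array is a hypothetical helper, not a NumPy function. As far as I know, recent NumPy releases also refuse to build an array from ragged nested lists unless dtype=object is passed explicitly, which is another reason not to rely on [33].

```python
import numpy as np

def as_object_array(items):
    """Return a 1-D object array with one element per item (hypothetical helper)."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # stores the item itself; no conversion to a numeric row
    return out

ragged = as_object_array([[1, 2], [1, 2, 3]])
equal = as_object_array([[1, 2, 4], [1, 2, 3]])

print(ragged.shape, ragged.dtype)  # (2,) object
print(equal.shape, equal.dtype)    # (2,) object
```

Both calls give a 1d array of lists, even when the lists happen to have equal lengths.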

I could create a 2d object array:

In [59]: x = np.empty((2,2),object)                                             
In [60]: x                                                                      
Out[60]: 
array([[None, None],                  # in this case filled with None
       [None, None]], dtype=object)

I can assign a different kind and size of object to each element:

In [61]: x[0,0] = np.arange(3)                                                  
In [62]: x[0,0] = [1,2,3]                                                       
In [63]: x[1,0] = 'abc'                                                         
In [64]: x[1,1] = np.arange(6).reshape(2,3)                                     
In [65]: x                                                                      
Out[65]: 
array([[list([1, 2, 3]), None],
       ['abc', array([[0, 1, 2],
       [3, 4, 5]])]], dtype=object)

It is still 2d. For most purposes it behaves like a list, or a list of lists, containing objects. The data buffer actually holds pointers to objects stored elsewhere in memory (just as a list's buffer does).
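
The pointer behavior can be seen with a short check. This is an illustrative sketch; payload is a made-up name.

```python
import numpy as np

x = np.empty((2, 2), dtype=object)
payload = [1, 2, 3]
x[0, 0] = payload

# The array stores a reference to the same list, not a copy.
print(x[0, 0] is payload)  # True

# Mutating the list is visible through the array.
payload.append(4)
print(x[0, 0])  # [1, 2, 3, 4]

# The shape describes only the grid of references.
print(x.shape)  # (2, 2)
```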

There really isn't such a thing as a 3d array with a variable last dimension. At best we can get a 2d array that contains lists or arrays of various sizes.


Make a list of two 2d arrays:

In [69]: alist = [np.arange(6).reshape(2,3), np.arange(4.).reshape(2,2)]        
In [70]: alist                                                                  
Out[70]: 
[array([[0, 1, 2],
        [3, 4, 5]]), array([[0., 1.],
        [2., 3.]])]

In this case, giving it to np.array raises an error:

In [71]: np.array(alist)
---------------------------------------------------------------------------
ValueError: could not broadcast input array from shape (2,3) into shape (2)

We could fill an object array with elements from this list:

In [72]: x = np.empty((4,),object)                                              
In [73]: x[0]=alist[0][0]                                                       
In [74]: x[1]=alist[0][1]                                                       
In [75]: x[2]=alist[1][0]                                                       
In [76]: x[3]=alist[1][1]                                                       
In [77]: x                                                                      
Out[77]: 
array([array([0, 1, 2]), array([3, 4, 5]), array([0., 1.]),
       array([2., 3.])], dtype=object)

and reshape it to 2d:

In [78]: x.reshape(2,2)                                                         
Out[78]: 
array([[array([0, 1, 2]), array([3, 4, 5])],
       [array([0., 1.]), array([2., 3.])]], dtype=object)

The result is a 2d array containing 1d arrays. To get the shapes of the elements I have to do something like:

In [87]: np.frompyfunc(lambda i:i.shape, 1,1)(Out[78])                          
Out[87]: 
array([[(3,), (3,)],
       [(2,), (2,)]], dtype=object)
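
For the case in the question (n observations, each a 2d array with k columns and its own duration), the element shapes can be summarized in one tuple. This is only a sketch: ragged_shape is a hypothetical helper, not a NumPy attribute, and it still loops over the elements in Python.

```python
import numpy as np

def ragged_shape(obj_array):
    """Summarize element shapes of a 1-D object array (hypothetical helper).

    Returns (n, d1, d2, ...), where an axis is None if its length
    differs between elements.
    """
    shapes = [np.shape(item) for item in obj_array]
    if not shapes:
        return (0,)
    if len({len(s) for s in shapes}) != 1:
        raise ValueError("elements differ in number of dimensions")
    inner = tuple(
        axis[0] if len(set(axis)) == 1 else None
        for axis in zip(*shapes)
    )
    return (len(obj_array),) + inner

x = np.empty(2, dtype=object)
x[0] = np.array([[1, 2], [1, 2]])
x[1] = np.array([[1, 2], [1, 2], [1, 2]])

print(x.shape)          # (2,)
print(ragged_shape(x))  # (2, None, 2)
```

Here None stands in for the blank in the desired (2, , 2): the array's own shape remains (2,), and the rest is read from the elements.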
