Get all columns in a numpy structured array

Question

I would like to slice a numpy structured array. I have an array

>>b
>>array([([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
       ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
       ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])], 
       dtype=[('fd', '<f8', (4,)), ('av', '<f8', (4,))])

and I want to access its elements to create a new array, with something like

>>b[:][:,0]

to get an array similar to this (all rows, with element [0] of every column). Please don't mind the parentheses, brackets and dimensions in the following, as this is not actual output.

>>array([([11.0],[1.0]),
  ([41.0],[4.0]),
  ([71.0],[7.0])],
  dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])

But I get this error:

>>b[:][:,0]
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  IndexError: too many indices for array

I would like to do this without looping over the names in the dtype. Thank you very much for the help.

Answer 1

You access the fields of a structured array by field name. There isn't a way around this, unless the dtypes let you view the array in a different way.

Let's call your desired output c.

In [1061]: b['fd']
Out[1061]: 
array([[  1.10000000e+01,   2.10000000e+01,   3.10000000e+01,
          1.00000000e-02],
       [  4.10000000e+01,   5.10000000e+01,   6.10000000e+01,
          1.10000000e-01],
       [  7.10000000e+01,   8.10000000e+01,   9.10000000e+01,
          2.10000000e-01]])

What I think you are trying to do is collect these values for both fields:

In [1062]: b['fd'][:,0]
Out[1062]: array([ 11.,  41.,  71.])

In [1064]: c['fd']
Out[1064]: 
array([[ 11.],
       [ 41.],
       [ 71.]])
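
The shape difference between those two results matters for the copy that follows. As a small sketch (b is rebuilt from the question so the snippet runs on its own; the names flat and kept are only for illustration): an integer index drops the subarray axis, while a list index keeps it, which is what a field of shape (1,) needs.

```python
import numpy as np

# Rebuild the array from the question.
dt = np.dtype([('fd', '<f8', (4,)), ('av', '<f8', (4,))])
b = np.array([([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
              ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
              ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])], dtype=dt)

flat = b['fd'][:, 0]    # integer index: the subarray axis is dropped
kept = b['fd'][:, [0]]  # list index: the subarray axis is kept with length 1

print(flat.shape, kept.shape)
```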

As I just explained in https://stackoverflow.com/a/38090370/901925, the recfunctions generally allocate a target array and copy values field by field.

So the field iteration solution would be something like:

In [1066]: c.dtype
Out[1066]: dtype([('fd', '<f8', (1,)), ('av', '<f8', (1,))])

In [1067]: b.dtype
Out[1067]: dtype([('fd', '<f8', (4,)), ('av', '<f8', (4,))])

In [1068]: d=np.zeros((b.shape), dtype=c.dtype)


In [1070]: for n in b.dtype.names:
    d[n][:] = b[n][:,[0]]

In [1071]: d
Out[1071]: 
array([([11.0], [1.0]), ([41.0], [4.0]), ([71.0], [7.0])], 
      dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])
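
The same field iteration can be wrapped in a function that also derives the target dtype from the source, so c does not have to exist beforehand. This is only a sketch: take_subcolumns is a hypothetical helper name, and it assumes a 1-d structured array in which every field is a 1-d subarray.

```python
import numpy as np

def take_subcolumns(arr, cols):
    """Hypothetical helper: copy the given subarray columns of every field."""
    cols = list(cols)
    # Target dtype: same field names and base types, narrower subarrays.
    new_dt = np.dtype([(name, arr.dtype[name].base, (len(cols),))
                       for name in arr.dtype.names])
    out = np.zeros(arr.shape, dtype=new_dt)
    for name in arr.dtype.names:
        # The list index keeps the subarray axis, so shapes match the field.
        out[name] = arr[name][:, cols]
    return out

# Rebuild the array from the question and take column 0 of each field.
dt = np.dtype([('fd', '<f8', (4,)), ('av', '<f8', (4,))])
b = np.array([([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
              ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
              ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])], dtype=dt)
d = take_subcolumns(b, [0])
print(d)
```

The loop still runs once per field, but each iteration is a vectorized copy over all rows.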

================

Since both fields are floats, I can view b as a 2d array and select the 2 subcolumns with 2d array indexing:

In [1083]: b.view((float,8)).shape
Out[1083]: (3, 8)

In [1084]: b.view((float,8))[:,[0,4]]
Out[1084]: 
array([[ 11.,   1.],
       [ 41.,   4.],
       [ 71.,   7.]])

Similarly, c can be viewed as 2d:

In [1085]: c.view((float,2))
Out[1085]: 
array([[ 11.,   1.],
       [ 41.,   4.],
       [ 71.,   7.]])

I can then copy the values into a blank d with:

In [1090]: d=np.zeros((b.shape), dtype=c.dtype)

In [1091]: d.view((float,2))[:]=b.view((float,8))[:,[0,4]]

In [1092]: d
Out[1092]: 
array([([11.0], [1.0]), ([41.0], [4.0]), ([71.0], [7.0])], 
      dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])
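
For reference, here is the view approach as one self-contained script, with the column offsets computed rather than hard-coded. It is a sketch that relies on the same assumptions as above: every field has the same float64 base type and the same width, and the dtype is packed with no padding. The names n_fields, width and cols are only for illustration.

```python
import numpy as np

# Rebuild the array from the question.
dt = np.dtype([('fd', '<f8', (4,)), ('av', '<f8', (4,))])
b = np.array([([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
              ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
              ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])], dtype=dt)

n_fields = len(b.dtype.names)    # 2 fields
width = b.dtype['fd'].shape[0]   # 4 values per field

# One row of 8 floats per record; this is a view, not a copy.
b2d = b.view((np.float64, n_fields * width))

# Element 0 of each field sits at flat offsets 0 and 4.
cols = [i * width for i in range(n_fields)]

d = np.zeros(b.shape, dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])
d.view((np.float64, n_fields))[:] = b2d[:, cols]
print(d)
```

If the fields had different base types or widths, the flat view would not line up and the field-by-field copy would be the safer choice.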

So, at least in this case, we don't have to do a field-by-field copy. But I can't say, without testing, which is faster. In my previous answer I found that field-by-field copy was relatively fast when dealing with many rows.
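
One way to run that test is a rough timeit comparison. This is a sketch, not a benchmark result: the row count, the repeat count and the function names by_field and by_view are arbitrary choices, and the timings will depend on the machine and the numpy version.

```python
import timeit
import numpy as np

src_dt = np.dtype([('fd', '<f8', (4,)), ('av', '<f8', (4,))])
dst_dt = np.dtype([('fd', '<f8', (1,)), ('av', '<f8', (1,))])

# Synthetic data with the same layout as b; the row count is arbitrary.
n_rows = 100_000
big = np.zeros(n_rows, dtype=src_dt)
big.view((np.float64, 8))[:] = np.arange(n_rows * 8, dtype=np.float64).reshape(n_rows, 8)

def by_field(arr):
    # Field-by-field copy.
    out = np.zeros(arr.shape, dtype=dst_dt)
    for name in arr.dtype.names:
        out[name] = arr[name][:, [0]]
    return out

def by_view(arr):
    # Single copy through 2d float views.
    out = np.zeros(arr.shape, dtype=dst_dt)
    out.view((np.float64, 2))[:] = arr.view((np.float64, 8))[:, [0, 4]]
    return out

# Both approaches should produce the same values.
same = np.array_equal(by_field(big).view((np.float64, 2)),
                      by_view(big).view((np.float64, 2)))

t_field = timeit.timeit(lambda: by_field(big), number=20)
t_view = timeit.timeit(lambda: by_view(big), number=20)
print("same result:", same)
print("by_field: %.4f s, by_view: %.4f s" % (t_field, t_view))
```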

Answer 1: 3, accepted, 2016-06-30 02:01:37
