
How to do columnwise operations with Numpy structured arrays?

This shows the problem nicely:

import numpy as np

a_type = np.dtype([("x", int), ("y", float)])
a_list = []

for i in range(0, 8, 2):
    entry = np.zeros((1,), dtype=a_type)
    entry["x"][0] = i
    entry["y"][0] = i + 1.0
    a_list.append(entry)
a_array = np.array(a_list, dtype=a_type)
a_array_flat = a_array.reshape(-1)
print(a_array_flat["x"])
print(np.sum(a_array_flat["x"]))

This produces the following output and traceback:

[0 2 4 6]
Traceback (most recent call last):
  File "/home/andreas/src/masiri/booking_algorythm/demo_structured_aarray_flatten.py", line 14, in <module>
    print(np.sum(a_array_flat["x"]))
  File "<__array_function__ internals>", line 180, in sum
  File "/home/andreas/src/masiri/venv/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2298, in sum
    return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
  File "/home/andreas/src/masiri/venv/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
numpy.core._exceptions._UFuncNoLoopError: ufunc 'add' did not contain a loop with signature matching types (dtype({'names': ['x'], 'formats': ['<i8'], 'offsets': [0], 'itemsize': 16}), dtype({'names': ['x'], 'formats': ['<i8'], 'offsets': [0], 'itemsize': 16})) -> None

I chose this data structure because I need to do many column-wise operations quickly, and I also have more esoteric types such as timedelta64 and datetime64. I am sure basic Numpy operations work and that I am overlooking something obvious. Please help me.

In an ipython session, your code runs fine:

In [2]: a_type = np.dtype([("x", int), ("y", float)])
   ...: a_list = []
   ...: 
   ...: for i in range(0, 8, 2):
   ...:     entry = np.zeros((1,), dtype=a_type)
   ...:     entry["x"][0] = i
   ...:     entry["y"][0] = i + 1.0
   ...:     a_list.append(entry)
   ...: a_array = np.array(a_list, dtype=a_type)
   ...: a_array_flat = a_array.reshape(-1)

In [3]: a_list
Out[3]: 
[array([(0, 1.)], dtype=[('x', '<i4'), ('y', '<f8')]),
 array([(2, 3.)], dtype=[('x', '<i4'), ('y', '<f8')]),
 array([(4, 5.)], dtype=[('x', '<i4'), ('y', '<f8')]),
 array([(6, 7.)], dtype=[('x', '<i4'), ('y', '<f8')])]

In [4]: a_array
Out[4]: 
array([[(0, 1.)],
       [(2, 3.)],
       [(4, 5.)],
       [(6, 7.)]], dtype=[('x', '<i4'), ('y', '<f8')])

In [5]: a_array_flat
Out[5]: 
array([(0, 1.), (2, 3.), (4, 5.), (6, 7.)],
      dtype=[('x', '<i4'), ('y', '<f8')])

In [6]: a_array_flat['x']
Out[6]: array([0, 2, 4, 6])

In [7]: np.sum(a_array_flat["x"])
Out[7]: 12

The error message almost looks like you are indexing with a field list:

In [8]: np.sum(a_array_flat[["x"]])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 np.sum(a_array_flat[["x"]])

File <__array_function__ internals>:5, in sum(*args, **kwargs)

...
TypeError: cannot perform reduce with flexible type

In [9]: a_array_flat[["x"]]
Out[9]: 
array([(0,), (2,), (4,), (6,)],
      dtype={'names':['x'], 'formats':['<i4'], 'offsets':[0], 'itemsize':12})
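To make the difference concrete, here is a minimal sketch (the array name `arr` and the way it is filled are my own choices, not taken from the question). Indexing with a single field name should give a plain numeric array, while indexing with a list of names keeps a structured dtype, which reductions such as `np.sum` reject. If a field list is really needed, `numpy.lib.recfunctions.structured_to_unstructured` is one way to get back to a plain array:

```python
import numpy as np
import numpy.lib.recfunctions as rf

a_type = np.dtype([("x", int), ("y", float)])
arr = np.zeros(4, dtype=a_type)
arr["x"] = [0, 2, 4, 6]
arr["y"] = [1.0, 3.0, 5.0, 7.0]

# Single field name: a plain integer array, so reductions work.
single = arr["x"]
single_is_plain = single.dtype.names is None
total_single = single.sum()

# List of field names: still a structured array (one field, same itemsize).
multi = arr[["x"]]
multi_is_structured = multi.dtype.names == ("x",)

# Convert the field-list view to an ordinary 2-D array before reducing.
plain = rf.structured_to_unstructured(multi)
total_multi = plain.sum()

print(single_is_plain, multi_is_structured, total_single, total_multi)
```

Checking `dtype.names` on the object you pass to `np.sum` is a quick way to see which of the two cases you are in.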

What numpy version are you using? There was a period where numpy versions flip-flopped on how they handled views of the array.

Doing the sum on the unflattened array:

In [11]: a_array["x"]
Out[11]: 
array([[0],
       [2],
       [4],
       [6]])

In [12]: a_array["x"].sum()
Out[12]: 12

Another way of constructing this array:

In [15]: import numpy.lib.recfunctions as rf
In [16]: arr = np.arange(8).reshape(4,2);arr
Out[16]: 
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])

In [17]: arr1 = rf.unstructured_to_structured(arr, dtype=a_type)    
In [18]: arr1
Out[18]: 
array([(0, 1.), (2, 3.), (4, 5.), (6, 7.)],
      dtype=[('x', '<i4'), ('y', '<f8')])

In [19]: arr1['x']
Out[19]: array([0, 2, 4, 6])

or:

In [20]: arr2 = np.zeros(4, a_type)
In [21]: arr2['x']=arr[:,0]; arr2['y']=arr[:,1]
In [22]: arr2
Out[22]: 
array([(0, 1.), (2, 3.), (4, 5.), (6, 7.)],
      dtype=[('x', '<i4'), ('y', '<f8')])
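The same fill-by-field pattern should carry over to the datetime64 and timedelta64 columns mentioned in the question. The following is only a sketch under my own assumptions: the field names `start` and `length`, the second-resolution units, and the sample values are all made up for illustration.

```python
import numpy as np

# Hypothetical record layout; "start" and "length" are illustrative names.
rec_type = np.dtype([
    ("x", int),
    ("y", float),
    ("start", "datetime64[s]"),
    ("length", "timedelta64[s]"),
])

rec = np.zeros(4, dtype=rec_type)
rec["x"] = [0, 2, 4, 6]
rec["y"] = [1.0, 3.0, 5.0, 7.0]
rec["start"] = np.array(
    ["2022-01-01T00:00:00", "2022-01-01T01:00:00",
     "2022-01-01T02:00:00", "2022-01-01T03:00:00"],
    dtype="datetime64[s]",
)
rec["length"] = np.array([60, 120, 180, 240], dtype="timedelta64[s]")

# Each single-field index is an ordinary array, so column-wise math is direct.
x_total = rec["x"].sum()
y_mean = rec["y"].mean()
end = rec["start"] + rec["length"]      # datetime64 + timedelta64, elementwise
total_length = rec["length"].sum()      # a timedelta64 scalar

print(x_total, y_mean, end[0], total_length)
```

Filling whole columns at once like this avoids building one small array per row and then reshaping.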

Edit

I get your error message with the python sum (as opposed to np.sum, which I showed above).

In [26]: sum(a_array[['x']])
---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
Input In [26], in <cell line: 1>()
----> 1 sum(a_array[['x']])

UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('int32'), dtype({'names':['x'], 'formats':['<i4'], 'offsets':[0], 'itemsize':12})) -> None
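As I understand it, the built-in sum starts from the integer 0 and adds each element in turn, so with a field-list index it ends up trying to add an int to a structured record, which has no matching `add` loop. Here is a small sketch of that behaviour, using a 1-d array of my own construction rather than the `a_array` above, and catching the generic `TypeError` because the exact exception class seems to vary between numpy versions:

```python
import numpy as np

a_type = np.dtype([("x", int), ("y", float)])
arr = np.zeros(4, dtype=a_type)
arr["x"] = [0, 2, 4, 6]
arr["y"] = [1.0, 3.0, 5.0, 7.0]

# Built-in sum over a single field iterates over plain integers.
builtin_total = sum(arr["x"])

# Built-in sum over a field-list index tries 0 + <structured record>.
try:
    sum(arr[["x"]])
    builtin_failed = False
except TypeError as exc:
    builtin_failed = True
    print(type(exc).__name__)

print(builtin_total, builtin_failed)
```

So it is worth checking both which sum is being called and whether the index is a string or a list of strings.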
