numpy array to ndarray

Question

I have an exported pandas dataframe that is now a numpy.array object.

subset = array[:4,:]
array([[  2.        ,  12.        ,  33.33333333,   2.        ,
         33.33333333,  12.        ],
       [  2.        ,   2.        ,  33.33333333,   2.        ,
         33.33333333,   2.        ],
       [  2.8       ,   8.        ,  45.83333333,   2.75      ,
         46.66666667,  13.        ],
       [  3.11320755,  75.        ,  56.        ,   3.24      ,
         52.83018868,  33.        ]])
print subset.dtype
dtype('float64')

I want to convert the column values to specific types and set column names as well, which means I need to convert it to an ndarray.

Here are my dtypes:

[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'), 
('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8'),
('NULL_COUNT_B', '<f8')]

When I go to convert the array, I get:

 ValueError: new type not compatible with array.

How do you cast each column to a specific type so I can convert the array to an ndarray?

Thanks

Answer 1

You already have an ndarray. What you are seeking is a structured array, one with this compound dtype. First see if pandas can do it for you. If that fails, we might be able to do something with tolist and a list comprehension.
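As a rough sketch of the pandas route, assuming the original DataFrame is still around: `DataFrame.astype` can cast individual columns and `DataFrame.to_records` returns a record array (a structured-array subclass) that carries the column names. The `df` here is a hypothetical stand-in built from two rows of the question's data, not the asker's actual frame.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame standing in for the one the array was exported from.
df = pd.DataFrame(
    [[2.0, 12.0, 33.33333333, 2.0, 33.33333333, 12.0],
     [2.8, 8.0, 45.83333333, 2.75, 46.66666667, 13.0]],
    columns=['PERCENT_A_NEW', 'JoinField', 'NULL_COUNT_B',
             'PERCENT_COMP_B', 'RANKING_A', 'RANKING_B'],
)

# Cast the one integer column, then let pandas build the record array.
df = df.astype({'JoinField': 'int32'})
rec = df.to_records(index=False)

print(rec.dtype.names)   # field names taken from the column labels
print(rec['JoinField'])  # fields are accessed by name
```

If the DataFrame is no longer available, the list-of-tuples approach works on the plain array directly.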

In [84]: dt=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'),
    ...: ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8'),
    ...: ('NULL_COUNT_B', '<f8')]
In [85]: subset=np.array([[  2.        ,  12.        ,  33.33333333,   2.        ,
    ...:          33.33333333,  12.        ],
    ...:        [  2.        ,   2.        ,  33.33333333,   2.        ,
    ...:          33.33333333,   2.        ],
    ...:        [  2.8       ,   8.        ,  45.83333333,   2.75      ,
    ...:          46.66666667,  13.        ],
    ...:        [  3.11320755,  75.        ,  56.        ,   3.24      ,
    ...:          52.83018868,  33.        ]])
In [86]: subset
Out[86]: 
array([[  2.        ,  12.        ,  33.33333333,   2.        ,
         33.33333333,  12.        ],
       [  2.        ,   2.        ,  33.33333333,   2.        ,
         33.33333333,   2.        ],
       [  2.8       ,   8.        ,  45.83333333,   2.75      ,
         46.66666667,  13.        ],
       [  3.11320755,  75.        ,  56.        ,   3.24      ,
         52.83018868,  33.        ]])

Now make an array with dt. Input for a structured array has to be a list of tuples, so I'm using tolist and a list comprehension.

In [87]: np.array([tuple(row) for row in subset.tolist()],dtype=dt)
....
ValueError: field 'NULL_COUNT_B' occurs more than once
In [88]: subset.shape
Out[88]: (4, 6)
In [89]: dt
Out[89]: 
[('PERCENT_A_NEW', '<f8'),
 ('JoinField', '<i4'),
 ('NULL_COUNT_B', '<f8'),
 ('PERCENT_COMP_B', '<f8'),
 ('RANKING_A', '<f8'),
 ('RANKING_B', '<f8'),
 ('NULL_COUNT_B', '<f8')]
In [90]: dt=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'),
    ...: ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8')]
In [91]: np.array([tuple(row) for row in subset.tolist()],dtype=dt)
Out[91]: 
array([(2.0, 12, 33.33333333, 2.0, 33.33333333, 12.0),
       (2.0, 2, 33.33333333, 2.0, 33.33333333, 2.0),
       (2.8, 8, 45.83333333, 2.75, 46.66666667, 13.0),
       (3.11320755, 75, 56.0, 3.24, 52.83018868, 33.0)], 
      dtype=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'), ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8')])
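
For what it's worth, newer NumPy versions (1.16 and later, as far as I know) also ship a helper, `numpy.lib.recfunctions.unstructured_to_structured`, that does the same conversion without going through Python tuples. A minimal sketch, assuming the corrected six-field dtype from above (the duplicate 'NULL_COUNT_B' entry removed so the field count matches the six columns):

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Six fields for six columns; the duplicate 'NULL_COUNT_B' has been dropped.
dt = np.dtype([('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'),
               ('NULL_COUNT_B', '<f8'), ('PERCENT_COMP_B', '<f8'),
               ('RANKING_A', '<f8'), ('RANKING_B', '<f8')])

subset = np.array([
    [2.0, 12.0, 33.33333333, 2.0, 33.33333333, 12.0],
    [2.0, 2.0, 33.33333333, 2.0, 33.33333333, 2.0],
    [2.8, 8.0, 45.83333333, 2.75, 46.66666667, 13.0],
    [3.11320755, 75.0, 56.0, 3.24, 52.83018868, 33.0],
])

# The last axis of the 2-D float array becomes the fields of a 1-D structured array.
structured = rfn.unstructured_to_structured(subset, dtype=dt)

print(structured.shape)         # one record per original row
print(structured['JoinField'])  # this column is cast to int32
```

The float-to-int cast on 'JoinField' truncates, so it is only safe when that column really holds whole numbers, as it does in this sample.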
