I have an exported pandas dataframe that is now a numpy.array object.
subset = array[:4,:]
array([[ 2. , 12. , 33.33333333, 2. ,
33.33333333, 12. ],
[ 2. , 2. , 33.33333333, 2. ,
33.33333333, 2. ],
[ 2.8 , 8. , 45.83333333, 2.75 ,
46.66666667, 13. ],
[ 3.11320755, 75. , 56. , 3.24 ,
52.83018868, 33. ]])
print subset.dtype
dtype('float64')
I want to convert the column values to specific types and set column names as well, which means I need to convert it to an ndarray.
Here are my dtypes:
[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'),
('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8'),
('NULL_COUNT_B', '<f8')]
When I go to convert the array, I get:
ValueError: new type not compatible with array.
How do you cast each column to a specific type so I can convert the array to an ndarray?
Thanks
You already have an ndarray. What you are seeking is a structured array, one with this compound dtype. First see if pandas can do it for you. If that fails we might be able to do something with tolist and a list comprehension.
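On the pandas side, one possible route is DataFrame.to_records, which returns a record array (a structured-array subclass) whose field names are the column names. This is only a sketch: the small DataFrame below is a hypothetical stand-in for the one that was exported, and it assumes JoinField is the only column that needs an integer type.

```python
import pandas as pd

# Hypothetical stand-in for the original DataFrame (two rows of the sample data).
columns = ['PERCENT_A_NEW', 'JoinField', 'NULL_COUNT_B',
           'PERCENT_COMP_B', 'RANKING_A', 'RANKING_B']
df = pd.DataFrame([[2.0, 12.0, 33.33333333, 2.0, 33.33333333, 12.0],
                   [2.8, 8.0, 45.83333333, 2.75, 46.66666667, 13.0]],
                  columns=columns)

# Cast the integer column first, then export without the index as a field.
records = df.astype({'JoinField': 'int32'}).to_records(index=False)

print(records.dtype.names)
print(records['JoinField'])
```

If the DataFrame is no longer available, the numpy-only approach that follows does not depend on pandas at all.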
In [84]: dt=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'),
...: ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8'),
...: ('NULL_COUNT_B', '<f8')]
In [85]: subset=np.array([[ 2. , 12. , 33.33333333, 2. ,
...: 33.33333333, 12. ],
...: [ 2. , 2. , 33.33333333, 2. ,
...: 33.33333333, 2. ],
...: [ 2.8 , 8. , 45.83333333, 2.75 ,
...: 46.66666667, 13. ],
...: [ 3.11320755, 75. , 56. , 3.24 ,
...: 52.83018868, 33. ]])
In [86]: subset
Out[86]:
array([[ 2. , 12. , 33.33333333, 2. ,
33.33333333, 12. ],
[ 2. , 2. , 33.33333333, 2. ,
33.33333333, 2. ],
[ 2.8 , 8. , 45.83333333, 2.75 ,
46.66666667, 13. ],
[ 3.11320755, 75. , 56. , 3.24 ,
52.83018868, 33. ]])
Now make an array with dt. Input for a structured array has to be a list of tuples, so I'm using tolist and a list comprehension:
In [87]: np.array([tuple(row) for row in subset.tolist()],dtype=dt)
....
ValueError: field 'NULL_COUNT_B' occurs more than once
In [88]: subset.shape
Out[88]: (4, 6)
In [89]: dt
Out[89]:
[('PERCENT_A_NEW', '<f8'),
('JoinField', '<i4'),
('NULL_COUNT_B', '<f8'),
('PERCENT_COMP_B', '<f8'),
('RANKING_A', '<f8'),
('RANKING_B', '<f8'),
('NULL_COUNT_B', '<f8')]
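The dtype list has two problems that the outputs above hint at: NULL_COUNT_B is listed twice, and there are 7 fields for an array with only 6 columns. A quick sanity check along these lines can catch both before building the structured array. This is a sketch; the variable names n_columns and duplicates are just illustrative.

```python
from collections import Counter

dt = [('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'),
      ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8'),
      ('NULL_COUNT_B', '<f8')]
n_columns = 6  # subset.shape[1] in the session above

# Field names must be unique, and there must be one field per column.
names = [name for name, _ in dt]
duplicates = [name for name, count in Counter(names).items() if count > 1]

print(duplicates)               # ['NULL_COUNT_B']
print(len(names), n_columns)    # 7 6
```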
In [90]: dt=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'),
...: ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8')]
In [91]: np.array([tuple(row) for row in subset.tolist()],dtype=dt)
Out[91]:
array([(2.0, 12, 33.33333333, 2.0, 33.33333333, 12.0),
(2.0, 2, 33.33333333, 2.0, 33.33333333, 2.0),
(2.8, 8, 45.83333333, 2.75, 46.66666667, 13.0),
(3.11320755, 75, 56.0, 3.24, 52.83018868, 33.0)],
dtype=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'), ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8')])
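As an alternative to the tolist round trip, newer numpy versions (1.16 and later, as far as I know) ship numpy.lib.recfunctions.unstructured_to_structured, which should do the same per-field cast directly. The sketch below uses two rows of the sample data and the corrected 6-field dtype; once the array is structured, columns are accessed by field name.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dt = np.dtype([('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'),
               ('NULL_COUNT_B', '<f8'), ('PERCENT_COMP_B', '<f8'),
               ('RANKING_A', '<f8'), ('RANKING_B', '<f8')])

subset = np.array([[2.0, 12.0, 33.33333333, 2.0, 33.33333333, 12.0],
                   [3.11320755, 75.0, 56.0, 3.24, 52.83018868, 33.0]])

# The last axis (6 columns) is mapped onto the 6 fields, casting as needed.
structured = rfn.unstructured_to_structured(subset, dt)

print(structured.shape)          # (2,)
print(structured['JoinField'])   # [12 75]
```

Note that the result is 1-dimensional: each row of the original 2-D array becomes one record.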