Functions within np.array

I'm trying to convert the .values I have into an array built with a conditional expression, but I keep getting the wrong result. I would appreciate the help!

Here is the .values:

Y = df['GDP_growth'].values
array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
       '0.801760179', '7.200000003', '3.727818428', '0.883846197'], dtype=object)

Here is the command that builds the array and gives the wrong result:

Y = np.array([1 if y>= 3 else 0 for y in Y])

In my case, the error is that it all comes out as 1.

The elements of Y are strings, so y >= 3 is not a numeric comparison. You could use NumPy boolean-mask indexing, but first you need to change the type from str or object to float (or np.float64):

import numpy as np
Y = np.array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
   '0.801760179', '7.200000003', '3.727818428', '0.883846197'], dtype=object)
Y = Y.astype(float)

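# Compute the mask once, then assign the labels (>= 3 matches the condition in the question)
mask = Y >= 3
Y[mask] = 1
Y[~mask] = 0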

In [67]: Y
Out[67]: array([ 1.,  0.,  1.,  1.,  0.,  1.,  1.,  0.])
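The same labelling can also be sketched in one step with np.where, which leaves the converted floats untouched and returns integer labels. This is only a minimal sketch; the names raw, growth and labels are made up for illustration and are not from the question.

```python
import numpy as np

# Same string values as in the question, stored with dtype=object
raw = np.array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
                '0.801760179', '7.200000003', '3.727818428', '0.883846197'],
               dtype=object)

# Convert the strings to floats so the comparison is numeric
growth = raw.astype(float)

# 1 where growth >= 3, otherwise 0; growth itself is not modified
labels = np.where(growth >= 3, 1, 0)
print(labels)  # [1 0 1 1 0 1 1 0]
```

Keeping growth and labels as separate arrays means the original numbers are still available after the labels are built.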

EDIT

If you need some preprocessing to convert your data to numeric values, you could use pd.to_numeric with errors='coerce' and then dropna, either on the series of interest or on the whole dataframe. For example, for a series:

import pandas as pd
z = pd.Series(Y, dtype=object)  # object dtype so a non-numeric value can be stored
z[0] = 'a'

In [293]: z
Out[293]:
0               a
1    -1.760010328
2     5.155440545
3     4.019541839
4     0.801760179
5     7.200000003
6     3.727818428
7     0.883846197
dtype: object

pd.to_numeric(z, errors='coerce').dropna() 

In [296]: pd.to_numeric(z, errors='coerce').dropna()
Out[296]:
1   -1.760010
2    5.155441
3    4.019542
4    0.801760
5    7.200000
6    3.727818
7    0.883846
dtype: float64  
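For a whole dataframe, the same idea might look like the sketch below. The frame, its column names and its values are hypothetical and only serve as an illustration; pd.to_numeric is applied column by column through DataFrame.apply.

```python
import pandas as pd

# Hypothetical frame: column names and values are invented for this example
df = pd.DataFrame({
    'GDP_growth': ['3.299991384', 'a', '5.155440545'],
    'inflation': ['1.2', '2.5', '..'],
})

# Convert every column; anything that cannot be parsed becomes NaN
numeric = df.apply(pd.to_numeric, errors='coerce')

# Drop every row that contains at least one NaN
clean = numeric.dropna()
print(clean)
```

Note that dropna on the frame removes a row if any of its columns failed to parse, so in this example only the first row survives.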

Figured it out! Apparently I had some missing values denoted as '..', so I had to drop those rows first; then I could apply .astype.
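A minimal sketch of that fix, assuming the missing values appear as the literal string '..' in the GDP_growth column (the small frame below is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data with one '..' placeholder for a missing value
df = pd.DataFrame({'GDP_growth': ['3.299991384', '..', '5.155440545', '0.801760179']})

# Drop the rows whose value is the '..' placeholder
df = df[df['GDP_growth'] != '..']

# Now every remaining value parses as a float
Y = df['GDP_growth'].astype(float).values

# The original list comprehension works once the values are numeric
Y = np.array([1 if y >= 3 else 0 for y in Y])
print(Y)  # [1 1 0]
```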
