I'm trying to turn the .values I have into a new array by applying a condition to each element, but I keep running into an error. I would appreciate the help!
Here is the .values:
Y = df['GDP_growth'].values
array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
'0.801760179', '7.200000003', '3.727818428', '0.883846197'], dtype=object)
Here is the command that builds the array and gives the wrong result:
Y = np.array([1 if y>= 3 else 0 for y in Y])
In my case, the problem is that every element comes out as 1.
You could use NumPy boolean indexing, but first you need to change the dtype from str or object to float (or np.float64), because the elements are strings and y >= 3 does not compare them numerically:
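import numpy as np
Y = np.array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
'0.801760179', '7.200000003', '3.727818428', '0.883846197'], dtype=object)
# Convert the strings to floats so the comparisons are numeric
Y = Y.astype(float)
# Same threshold as in the question: values >= 3 become 1, the rest 0
Y[Y<3] = 0
Y[Y>=3] = 1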
In [67]: Y
Out[67]: array([ 1., 0., 1., 1., 0., 1., 1., 0.])
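The same labels can also be built in one step from a boolean mask, which avoids overwriting the float values in place. This is only a minimal sketch, assuming the threshold from the question (>= 3); the names raw, values and labels are illustrative and not from the original post:

```python
import numpy as np

raw = np.array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
                '0.801760179', '7.200000003', '3.727818428', '0.883846197'],
               dtype=object)

values = raw.astype(float)          # strings -> float64
labels = (values >= 3).astype(int)  # boolean mask -> 1/0 integers
print(labels)                       # [1 0 1 1 0 1 1 0]
```

Keeping values untouched means the original numbers are still available after the labels are computed.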
EDIT
If you need some preprocessing to convert your data to numeric values, you could use pd.to_numeric and then dropna, applied either to the series of interest or to the whole DataFrame. For example, for a series:
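import pandas as pd
z = pd.Series(Y)  # Y here is the original object array of strings
z[0] = 'a'        # inject a non-numeric value for the demonstration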
In [293]: z
Out[293]:
0 a
1 -1.760010328
2 5.155440545
3 4.019541839
4 0.801760179
5 7.200000003
6 3.727818428
7 0.883846197
dtype: object
pd.to_numeric(z, errors='coerce').dropna()
In [296]: pd.to_numeric(z, errors='coerce').dropna()
Out[296]:
1 -1.760010
2 5.155441
3 4.019542
4 0.801760
5 7.200000
6 3.727818
7 0.883846
dtype: float64
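Once the series is numeric, the thresholding from the question can be applied directly. The following is a rough sketch rather than a definitive recipe; the names clean and labels are made up for illustration, and the >= 3 threshold is taken from the question:

```python
import pandas as pd

z = pd.Series(['a', '-1.760010328', '5.155440545', '4.019541839',
               '0.801760179', '7.200000003', '3.727818428', '0.883846197'])

# 'a' cannot be parsed, so errors='coerce' turns it into NaN, and dropna removes it
clean = pd.to_numeric(z, errors='coerce').dropna()

# Boolean comparison, then cast to 1/0; the surviving index labels (1..7) are kept
labels = (clean >= 3).astype(int)
print(labels.tolist())  # [0, 1, 1, 0, 1, 1, 0]
```

Note that the result has one element fewer than the input, so it no longer lines up positionally with any other column unless the same rows are dropped there too.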
Figured it out! Apparently I had some missing values denoted as '..', so I had to clean the data first by dropping those rows. After that, I could apply .astype.
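For anyone hitting the same problem, the fix described here might look roughly like the sketch below. The DataFrame contents are a hypothetical stand-in for the real data; only the column name GDP_growth and the '..' placeholder come from this thread:

```python
import pandas as pd

# Hypothetical sample data; the real df is not shown in the question
df = pd.DataFrame({'GDP_growth': ['3.299991384', '..', '5.155440545',
                                  '0.801760179', '..']})

# Drop the rows where the value is the '..' placeholder
df = df[df['GDP_growth'] != '..'].copy()

# With only numeric strings left, astype(float) succeeds
df['GDP_growth'] = df['GDP_growth'].astype(float)

Y = (df['GDP_growth'] >= 3).astype(int).values
print(Y)  # [1 1 0]
```

If the data come from a CSV file, passing na_values=['..'] to pd.read_csv should mark those entries as NaN at load time, so that dropna can be used instead of the string comparison.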