Converting strings to floats in a DataFrame
How do I convert a DataFrame column containing strings and NaN values to floats? There is also another column whose values are a mix of strings and floats; how do I convert that entire column to floats?
NOTE: pd.convert_objects has been deprecated and has since been removed from pandas. You should use pd.Series.astype(float) or pd.to_numeric, as described in other answers.
This is available in 0.11. It forces conversion (or sets the value to NaN). It works even when astype would fail. It also operates series by series, so it won't convert, say, a column made up entirely of strings.
In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))
In [11]: df
Out[11]:
A B
0 1.0 1.0
1 1 foo
In [12]: df.dtypes
Out[12]:
A object
B object
dtype: object
In [13]: df.convert_objects(convert_numeric=True)
Out[13]:
A B
0 1 1
1 1 NaN
In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]:
A float64
B float64
dtype: object
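Since convert_objects is not available in current pandas, the session above can be approximated with pd.to_numeric applied column by column. This is only a sketch and not the original answer's code; errors='coerce' is an assumption chosen to mimic the "set to NaN" behaviour of convert_numeric=True.

```python
import pandas as pd

# Same data as the session above: every value is stored as a string.
df = pd.DataFrame({"A": ["1.0", "1"], "B": ["1.0", "foo"]})

# Apply pd.to_numeric to each column. Values that cannot be parsed
# (here 'foo') become NaN because of errors='coerce'.
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted)
print(converted.dtypes)
```

One difference to keep in mind: with errors='coerce', a column made up entirely of non-numeric strings becomes all NaN, so it is safer to apply this only to the columns you expect to be numeric.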
You can try df.column_name = df.column_name.astype(float). As for the NaN values, you need to decide how they should be converted; you can use the .fillna method to do it.
Example:
In [12]: df
Out[12]:
a b
0 0.1 0.2
1 NaN 0.3
2 0.4 0.5
In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)
In [14]: df.a = df.a.astype(float).fillna(0.0)
In [15]: df
Out[15]:
a b
0 0.1 0.2
1 0.0 0.3
2 0.4 0.5
In [16]: df.a.values
Out[16]: array([ 0.1, 0. , 0.4])
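The session above can be reproduced as a standalone script. This is a minimal sketch that rebuilds the same small frame by hand; the fill value 0.0 is just the one used in the example, not a recommendation.

```python
import numpy as np
import pandas as pd

# Column 'a' holds numeric strings plus a missing value, as in the example.
df = pd.DataFrame({"a": ["0.1", np.nan, "0.4"], "b": [0.2, 0.3, 0.5]})

# astype(float) parses the strings and leaves NaN as NaN;
# fillna(0.0) then replaces the missing value with the chosen default.
df["a"] = df["a"].astype(float).fillna(0.0)

print(df)
print(df["a"].dtype)
```

If the missing values should stay missing, the .fillna call can simply be dropped, since NaN is already a valid float.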
In newer versions of pandas (0.17 and up), you can use the to_numeric function. It can convert individual columns, or the whole dataframe when applied column by column (for example with df.apply(pd.to_numeric)). It also lets you choose how to handle values that can't be converted to numeric values:
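import pandas as pd
# Every value can be parsed, so this returns a float64 Series.
s = pd.Series(['1.0', '2', -3])
pd.to_numeric(s)
s = pd.Series(['apple', '1.0', '2', -3])
# errors='ignore' returns the input unchanged (deprecated since pandas 2.2).
pd.to_numeric(s, errors='ignore')
# errors='coerce' replaces unparseable values ('apple') with NaN.
pd.to_numeric(s, errors='coerce')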
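To make the difference between the error modes concrete, here is a small sketch using the same Series. The variable names (raised, coerced, n_bad) are illustrative only.

```python
import pandas as pd

s = pd.Series(["apple", "1.0", "2", -3])

# Default errors='raise': an unparseable value raises ValueError.
try:
    pd.to_numeric(s)
    raised = False
except ValueError:
    raised = True

# errors='coerce': unparseable values become NaN, the rest become floats.
coerced = pd.to_numeric(s, errors="coerce")

# Counting NaN afterwards shows how many values failed to parse
# (assuming the input had no missing values to begin with).
n_bad = int(coerced.isna().sum())

print(raised, coerced.tolist(), n_bad)
```

Coercing and then inspecting the NaN positions is often a convenient way to find the offending rows before deciding whether to drop or fix them.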
df['MyColumnName'] = df['MyColumnName'].astype('float64')
You have to replace empty strings ('') with np.nan before converting to float, i.e.:
df['a']=df.a.replace('',np.nan).astype(float)
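A self-contained sketch of the empty-string case, using a made-up column. It also shows pd.to_numeric with errors='coerce' as an alternative that yields NaN for the empty string; that alternative is my suggestion, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical column where a missing reading was stored as ''.
df = pd.DataFrame({"a": ["1.5", "", "3"]})

# Option 1 (from the answer): replace '' with NaN, then cast.
via_replace = df["a"].replace("", np.nan).astype(float)

# Option 2 (assumption): coerce with to_numeric; '' ends up as NaN.
via_to_numeric = pd.to_numeric(df["a"], errors="coerce")

print(via_replace.tolist())
print(via_to_numeric.tolist())
```

Note that option 2 also turns any other unparseable string into NaN, so option 1 is stricter if only empty strings are expected.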
Here is an example:
GHI Temp Power Day_Type
2016-03-15 06:00:00 -7.99999952505459e-7 18.3 0 NaN
2016-03-15 06:01:00 -7.99999952505459e-7 18.2 0 NaN
2016-03-15 06:02:00 -7.99999952505459e-7 18.3 0 NaN
2016-03-15 06:03:00 -7.99999952505459e-7 18.3 0 NaN
2016-03-15 06:04:00 -7.99999952505459e-7 18.3 0 NaN
But if these are all string values, as they were in my case, convert the desired columns to floats:
df_inv_29['GHI'] = df_inv_29.GHI.astype(float)
df_inv_29['Temp'] = df_inv_29.Temp.astype(float)
df_inv_29['Power'] = df_inv_29.Power.astype(float)
Your dataframe will now have float values :-)
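The three assignments can also be written as one call by passing a column-to-dtype mapping to astype. The frame below is a two-row stand-in for df_inv_29 built by hand (the Day_Type column is omitted), so treat it as a sketch rather than the original data.

```python
import pandas as pd

# Hypothetical stand-in for df_inv_29: all values stored as strings.
df_inv_29 = pd.DataFrame(
    {
        "GHI": ["-7.99999952505459e-7", "-7.99999952505459e-7"],
        "Temp": ["18.3", "18.2"],
        "Power": ["0", "0"],
    },
    index=pd.to_datetime(["2016-03-15 06:00:00", "2016-03-15 06:01:00"]),
)

# Cast several columns in one call; scientific notation parses fine.
df_inv_29 = df_inv_29.astype({"GHI": float, "Temp": float, "Power": float})

print(df_inv_29.dtypes)
```

Columns not named in the mapping keep their existing dtype, which is handy when only some columns are numeric.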
import pandas as pd
df['a'] = pd.to_numeric(df['a'])