pandas: how to convert all string values to float
I want to convert all the string values in a Pandas DataFrame to float. I could define a short function to do this, but that does not feel like a Pythonic way to do it. My DataFrame looks like this:
>>> df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))
>>> df
   0  1  2
0  1  2  3
1  4  5  6
>>> df.dtypes
0 object
1 object
2 object
dtype: object
>>> type(df[0][0])
<type 'str'>
I just wonder whether there is a built-in function of Pandas DataFrame that converts all the string values to float. If you know of such a built-in function in the Pandas docs, please post the link.
Another option is to use df.convert_objects(convert_numeric=True). It attempts to convert numeric strings to numbers, with unconvertible values becoming NaN:
import pandas as pd
df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])
df = df.convert_objects(convert_numeric=True)
print(df)
yields
     0    1    2
0    1    2    3
1    4    5  NaN
2  NaN  NaN  NaN
In contrast, df.astype(float) would raise ValueError: could not convert string to float: quux, since some strings in the above DataFrame (such as 'quux') are not numeric.
Note: in future versions of pandas (after 0.16.2) the function argument will be numeric=True instead of convert_numeric=True.
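For readers on newer pandas: convert_objects was deprecated in later releases in favor of pd.to_numeric. The sketch below is not part of the original answer; it assumes a pandas version that provides pd.to_numeric and reuses the sample frame from above to get the same "unconvertible values become NaN" behavior.

```python
import pandas as pd

# Same sample frame as above: a mix of numeric and non-numeric strings.
df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])

# Apply pd.to_numeric column by column; errors='coerce' turns
# strings that cannot be parsed into NaN instead of raising.
converted = df.apply(pd.to_numeric, errors='coerce')

print(converted)
print(converted.dtypes)
```

Because every column here ends up containing at least one NaN, each column comes back as a float column.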
Assuming all values can be correctly converted to float, you can use the DataFrame.astype() function to convert the whole DataFrame to float. Example -
df = df.astype(float)
Demo -
In [5]: df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))
In [6]: df.astype(float)
Out[6]:
   0  1  2
0  1  2  3
1  4  5  6
In [7]: df = df.astype(float)
In [8]: df.dtypes
Out[8]:
0 float64
1 float64
2 float64
dtype: object
The .astype() function also has a raise_on_error argument (which defaults to True) that you can set to False to make it ignore errors. In such cases, the original value is used in the DataFrame -
In [10]: df = pd.DataFrame([['1', '2', '3'], ['4', '5', '6'],['blah','bloh','bleh']])
In [11]: df.astype(float,raise_on_error=False)
Out[11]:
      0     1     2
0     1     2     3
1     4     5     6
2  blah  bloh  bleh
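In later pandas releases the raise_on_error argument was replaced by errors='raise' / errors='ignore'. The following is a minimal sketch of the same demo under that assumption; it is not from the original answer, and it requires a pandas version whose astype() accepts the errors keyword.

```python
import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', '6'], ['blah', 'bloh', 'bleh']])

# errors='ignore' is the later spelling of raise_on_error=False:
# when the cast fails, the original data is returned instead of raising.
result = df.astype(float, errors='ignore')

print(result)
```

Note that every column in this frame contains a non-numeric string, so nothing is converted and the values stay strings, including '1' through '6'.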
To convert just a series/column to float, again assuming all values can be converted, you can use Series.astype(). Example -
df['somecol'] = df['somecol'].astype(<type>)
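As a concrete illustration of the single-column case, here is a small sketch. The frame and the second column name 'other' are made up for the example; only 'somecol' comes from the answer above.

```python
import pandas as pd

# Hypothetical frame: 'somecol' holds numeric strings, 'other' is left untouched.
df = pd.DataFrame({'somecol': ['1.5', '2', '3'], 'other': ['a', 'b', 'c']})

# Convert only the one column; this raises ValueError if any value is not numeric.
df['somecol'] = df['somecol'].astype(float)

print(df.dtypes)
```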