
pandas: how to convert all string values to float

I want to convert all the string values in a pandas DataFrame to float. I could define a short function to do this, but that doesn't feel like the Pythonic way. My DataFrame looks like this:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))
>>> df
   0  1  2
0  1  2  3
1  4  5  6
>>> df.dtypes
0    object
1    object
2    object
dtype: object
>>> type(df[0][0])
<type 'str'>

I wonder whether pandas has a built-in DataFrame function that converts all string values to float. If you know of one in the pandas docs, please post the link.

Another option is to use df.convert_objects(convert_numeric=True). It attempts to convert numeric strings to numbers, with unconvertible values becoming NaN:

import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])
df = df.convert_objects(convert_numeric=True)
print(df)

yields

    0   1   2
0   1   2   3
1   4   5 NaN
2 NaN NaN NaN

In contrast, df.astype(float) would raise ValueError: could not convert string to float: quux, since some strings in the above DataFrame (such as 'quux') are not numeric.

Note: in future versions of pandas (after 0.16.2) the function argument will be numeric=True instead of convert_numeric=True.
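On current pandas releases, where convert_objects is no longer available, the same coercion can be sketched with pd.to_numeric applied column by column. This is not part of the original answer; it assumes a pandas version that provides pd.to_numeric, and the variable name converted is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])

# pd.to_numeric handles one Series at a time, so apply it to each column.
# errors='coerce' replaces values that cannot be parsed with NaN.
converted = df.apply(pd.to_numeric, errors='coerce')

print(converted)
print(converted.dtypes)
```

Because every column ends up containing at least one NaN, all three columns come back as float64.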

Assuming all values can be correctly converted to float, you can use the DataFrame.astype() function to convert the whole DataFrame to float. Example -

df = df.astype(float)

Demo -

In [5]: df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))

In [6]: df.astype(float)
Out[6]:
   0  1  2
0  1  2  3
1  4  5  6

In [7]: df = df.astype(float)

In [8]: df.dtypes
Out[8]:
0    float64
1    float64
2    float64
dtype: object

The .astype() function also has a raise_on_error argument (which defaults to True); you can set it to False to make it ignore errors. In such cases, the original value is kept in the DataFrame -

In [10]: df = pd.DataFrame([['1', '2', '3'], ['4', '5', '6'],['blah','bloh','bleh']])

In [11]: df.astype(float,raise_on_error=False)
Out[11]:
      0     1     2
0     1     2     3
1     4     5     6
2  blah  bloh  bleh
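Since the raise_on_error argument is not available in every pandas version, a similar "convert what you can" behaviour can be sketched by hand. The helper below, to_float_where_possible, is a hypothetical name and not a pandas API. Unlike the call above, it works column by column: a column is cast to float only if every value in it converts, and is otherwise left untouched.

```python
import pandas as pd


def to_float_where_possible(frame):
    """Return a copy in which every fully convertible column is cast to float.

    Hypothetical helper, not part of pandas: a column containing any value
    that cannot be converted is left exactly as it was.
    """
    result = frame.copy()
    for col in result.columns:
        try:
            result[col] = result[col].astype(float)
        except (ValueError, TypeError):
            # At least one value is not numeric; keep the original column.
            pass
    return result


df = pd.DataFrame({'a': ['1', '4'], 'b': ['2', 'blah'], 'c': ['3.5', '6']})
out = to_float_where_possible(df)
print(out.dtypes)
```

Here columns a and c become float64, while b keeps its original strings because 'blah' cannot be converted.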

To convert just a single Series/column to float, again assuming all values can be converted, you can use Series.astype(). Example -

df['somecol'] = df['somecol'].astype(<type>)
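As a concrete instance of the pattern above, here is a minimal sketch with float filled in for the type placeholder. The column names and sample values are made up for illustration.

```python
import pandas as pd

# Illustrative data: 'somecol' holds numeric strings, 'other' holds plain text.
df = pd.DataFrame({'somecol': ['1.5', '2', '3'], 'other': ['x', 'y', 'z']})

# Convert only the one column; the rest of the DataFrame is left as-is.
df['somecol'] = df['somecol'].astype(float)

print(df.dtypes)
```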
