
Checking if a string can be converted to float in Python

I've got some Python code that runs through a list of strings and converts them to integers or floating point numbers if possible. Doing this for integers is pretty easy:

if element.isdigit():
  newelement = int(element)

Floating point numbers are more difficult. Right now I'm using partition('.') to split the string and checking to make sure that one or both sides are digits.

partition = element.partition('.')
if ((partition[0].isdigit() and partition[1] == '.' and partition[2].isdigit())
    or (partition[0] == '' and partition[1] == '.' and partition[2].isdigit())
    or (partition[0].isdigit() and partition[1] == '.' and partition[2] == '')):
  newelement = float(element)

This works, but obviously the if statement for that is a bit of a bear. The other solution I considered is to just wrap the conversion in a try/except block and see if it succeeds, as described in this question.

Anyone have any other ideas? Opinions on the relative merits of the partition and try/except approaches?

I would just use..

try:
    float(element)
except ValueError:
    print("Not a float")

..it's simple, and it works. Note that it will still throw OverflowError if element is e.g. the integer 1<<1024.
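
To cover the OverflowError case mentioned above, the same idea can be wrapped in a helper that also catches the other exceptions float() can raise. This is only a minimal sketch; the name can_convert_to_float is made up for illustration and does not come from the answer.

```python
def can_convert_to_float(value):
    """Return True if float(value) succeeds, False otherwise."""
    try:
        float(value)
    except (ValueError, TypeError, OverflowError):
        # ValueError: malformed string such as "abc"
        # TypeError: unsupported type such as None or a list
        # OverflowError: an int too large for a float, e.g. 1 << 1024
        return False
    return True
```

Whether TypeError and OverflowError should count as "not a float" or be allowed to propagate depends on the caller; swallowing them is an assumption made here.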

Another option would be a regular expression:

import re
if re.match(r'^-?\d+(?:\.\d+)$', element) is None:
    print("Not float")
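
It may help to see what that pattern accepts, since it is stricter than float(): the fractional part is mandatory and there is no support for exponents or a leading plus sign. The helper name matches_decimal_pattern below is hypothetical and used only for this sketch.

```python
import re

# Same pattern as above: optional minus sign, digits, then a mandatory ".digits" part.
_DECIMAL_RE = re.compile(r'^-?\d+(?:\.\d+)$')

def matches_decimal_pattern(text):
    """Return True if text looks like -?digits.digits."""
    return _DECIMAL_RE.match(text) is not None

# Print how a few inputs fare; float() itself would accept all but the last.
for sample in ("1.5", "-1.5", "12", ".5", "1e3", "+1.5", "abc"):
    print(repr(sample), matches_decimal_pattern(sample))
```

Strings such as "12", ".5" and "1e3" are rejected by the pattern even though float() converts them, so the two approaches are not interchangeable.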

Python method to check for float:

from typing import Any

def is_float(element: Any) -> bool:
    try:
        float(element)
        return True
    except ValueError:
        return False

Always do unit testing. What is and is not a float may surprise you:

Command to parse                          Is it a float?  Comment
----------------------------------------  --------------  ------------------------------------
print(is_float(""))                       False
print(is_float("1234567"))                True
print(is_float("NaN"))                    True            nan is also a float
print(is_float("NaNananana BATMAN"))      False
print(is_float("123.456"))                True
print(is_float("123.E4"))                 True
print(is_float(".1"))                     True
print(is_float("1,234"))                  False
print(is_float("NULL"))                   False
print(is_float(",1"))                     False
print(is_float("123.EE4"))                False
print(is_float("6.523537535629999e-07"))  True
print(is_float("6e777777"))               True            This is the same as Inf
print(is_float("-iNF"))                   True            Case insensitive
print(is_float("1.797693e+308"))          True
print(is_float("infinity"))               True
print(is_float("infinity and BEYOND"))    False
print(is_float("12.34.56"))               False           Two dots not allowed
print(is_float("#56"))                    False
print(is_float("56%"))                    False
print(is_float("0E0"))                    True
print(is_float("x86E0"))                  False
print(is_float("86-5"))                   False
print(is_float("True"))                   False           The string "True" is not a float
print(is_float(True))                     True            bool is accepted: float(True) == 1.0
print(is_float("+1e1^5"))                 False
print(is_float("+1e1"))                   True
print(is_float("+1e1.3"))                 False
print(is_float("+1.3P1"))                 False
print(is_float("-+1"))                    False
print(is_float("(1)"))                    False           Brackets are not interpreted

'1.43'.replace('.','',1).isdigit()

which will return true only if there is one or no '.' in the string of digits.

'1.4.3'.replace('.','',1).isdigit()

will return false

'1.ww'.replace('.','',1).isdigit()

will return false
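
Wrapped in a function, the replace-then-isdigit trick looks like the sketch below. The name is_plain_decimal is an assumption for illustration. Note that the check knows nothing about signs or exponents, so it rejects inputs that float() would accept.

```python
def is_plain_decimal(text):
    """True for strings of digits with at most one '.', e.g. '143' or '1.43'."""
    # Remove the first '.' only; a second '.' survives and makes isdigit() fail.
    return text.replace('.', '', 1).isdigit()

# Inputs from the answer plus a few that show the limits of the approach.
for sample in ('1.43', '143', '1.4.3', '1.ww', '-1.5', '1e3', '.'):
    print(repr(sample), is_plain_decimal(sample))
```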

TL;DR:

  • If your input is mostly strings that can be converted to floats, the try: except: method is the best native Python method.
  • If your input is mostly strings that cannot be converted to floats, regular expressions or the partition method will be better.
  • If you are 1) unsure of your input or need more speed and 2) don't mind and can install a third-party C-extension, fastnumbers works very well.

There is another method available via a third-party module called fastnumbers (disclosure, I am the author); it provides a function called isfloat. I have taken the unittest example outlined by Jacob Gabrielson in this answer, but added the fastnumbers.isfloat method. I should also note that Jacob's example did not do justice to the regex option, because most of the time in that example was spent in global lookups caused by the dot operator. I have modified that function to give a fairer comparison to try: except:.


def is_float_try(str):
    try:
        float(str)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$").match
def is_float_re(str):
    return True if _float_regexp(str) else False

def is_float_partition(element):
    partition=element.partition('.')
    if (partition[0].isdigit() and partition[1]=='.' and partition[2].isdigit()) or (partition[0]=='' and partition[1]=='.' and partition[2].isdigit()) or (partition[0].isdigit() and partition[1]=='.' and partition[2]==''):
        return True
    else:
        return False

from fastnumbers import isfloat


if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('ttest.is_float_re("12.2x")', "import ttest").timeit()
            print 're happy:', timeit.Timer('ttest.is_float_re("12.2")', "import ttest").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('ttest.is_float_try("12.2x")', "import ttest").timeit()
            print 'try happy:', timeit.Timer('ttest.is_float_try("12.2")', "import ttest").timeit()

        def test_fn_perf(self):
            print
            print 'fn sad:', timeit.Timer('ttest.isfloat("12.2x")', "import ttest").timeit()
            print 'fn happy:', timeit.Timer('ttest.isfloat("12.2")', "import ttest").timeit()


        def test_part_perf(self):
            print
            print 'part sad:', timeit.Timer('ttest.is_float_partition("12.2x")', "import ttest").timeit()
            print 'part happy:', timeit.Timer('ttest.is_float_partition("12.2")', "import ttest").timeit()

    unittest.main()

On my machine, the output is:

fn sad: 0.220988988876
fn happy: 0.212214946747
.
part sad: 1.2219619751
part happy: 0.754667043686
.
re sad: 1.50515985489
re happy: 1.01107215881
.
try sad: 2.40243887901
try happy: 0.425730228424
.
----------------------------------------------------------------------
Ran 4 tests in 7.761s

OK

As you can see, regex is actually not as bad as it originally seemed, and if you have a real need for speed, the fastnumbers method is quite good.
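
For readers on Python 3, the try and regex timings can be reproduced with a sketch along these lines. The function time_checks and its default iteration count are illustrative and not part of the original benchmark, and fastnumbers is left out so that only the standard library is needed. Absolute numbers will differ from those shown above.

```python
import re
import timeit

# Regex used by is_float_re in the benchmark above.
_FLOAT_RE = re.compile(
    r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$"
)

def is_float_try(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def is_float_re(text):
    return _FLOAT_RE.match(text) is not None

def time_checks(number=100_000):
    """Return {(function name, input): seconds} for a happy and a sad input."""
    results = {}
    for func in (is_float_try, is_float_re):
        for text in ("12.2", "12.2x"):
            # The lambda is timed immediately, so capturing func/text is safe.
            timer = timeit.Timer(lambda: func(text))
            results[(func.__name__, text)] = timer.timeit(number=number)
    return results
```

Calling time_checks() and printing the dictionary gives one timing per function and input, which is enough to see whether the sad path of the try-based version is still the slowest on a given interpreter.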

Just for variety, here is another method to do it.

>>> all([i.isnumeric() for i in '1.2'.split('.',1)])
True
>>> all([i.isnumeric() for i in '2'.split('.',1)])
True
>>> all([i.isnumeric() for i in '2.f'.split('.',1)])
False

Edit: I'm sure it will not hold up for all cases of float, though, especially when there is an exponent. To handle that, it looks like this. This returns True only if val is a float and False for an int, but it is probably less performant than the regex.

>>> def isfloat(val):
...     # every character must be a digit, '.' or 'e', and there must be exactly one '.'
...     return all(c.isnumeric() or c in '.e' for c in val) and val.count('.') == 1
...
>>> isfloat('1')
False
>>> isfloat('1.2')
True
>>> isfloat('1.2e3')
True
>>> isfloat('12e3')
False

Simplified version of the function is_digit(), which suffices in most cases (doesn't consider exponential notation or the "NaN" value):

def is_digit(s):
    # strip leading '-' characters, drop at most one '.', then require only digits
    return s.lstrip('-').replace('.', '', 1).isdigit()

If you cared about performance (and I'm not suggesting you should), the try-based approach is the clear winner (compared with your partition-based approach or the regexp approach), as long as you don't expect a lot of invalid strings, in which case it's potentially slower (presumably due to the cost of exception handling).

Again, I'm not suggesting you care about performance, just giving you the data in case you're doing this 10 billion times a second, or something. Also, the partition-based code doesn't handle at least one valid string.

$ ./floatstr.py
F..
partition sad: 3.1102449894
partition happy: 2.09208488464
..
re sad: 7.76906108856
re happy: 7.09421992302
..
try sad: 12.1525540352
try happy: 1.44165301323
.
======================================================================
FAIL: test_partition (__main__.ConvertTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./floatstr.py", line 48, in test_partition
    self.failUnless(is_float_partition("20e2"))
AssertionError

----------------------------------------------------------------------
Ran 8 tests in 33.670s

FAILED (failures=1)

Here's the code (Python 2.6, regexp taken from John Gietzen's answer):

def is_float_try(str):
    try:
        float(str)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$")
def is_float_re(str):
    return re.match(_float_regexp, str)


def is_float_partition(element):
    partition=element.partition('.')
    if (partition[0].isdigit() and partition[1]=='.' and partition[2].isdigit()) or (partition[0]=='' and partition[1]=='.' and partition[2].isdigit()) or (partition[0].isdigit() and partition[1]=='.' and partition[2]==''):
        return True

if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):
        def test_re(self):
            self.failUnless(is_float_re("20e2"))

        def test_try(self):
            self.failUnless(is_float_try("20e2"))

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('floatstr.is_float_re("12.2x")', "import floatstr").timeit()
            print 're happy:', timeit.Timer('floatstr.is_float_re("12.2")', "import floatstr").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('floatstr.is_float_try("12.2x")', "import floatstr").timeit()
            print 'try happy:', timeit.Timer('floatstr.is_float_try("12.2")', "import floatstr").timeit()

        def test_partition_perf(self):
            print
            print 'partition sad:', timeit.Timer('floatstr.is_float_partition("12.2x")', "import floatstr").timeit()
            print 'partition happy:', timeit.Timer('floatstr.is_float_partition("12.2")', "import floatstr").timeit()

        def test_partition(self):
            self.failUnless(is_float_partition("20e2"))

        def test_partition2(self):
            self.failUnless(is_float_partition(".2"))

        def test_partition3(self):
            self.failIf(is_float_partition("1234x.2"))

    unittest.main()

If you don't need to worry about scientific or other expressions of numbers and are only working with strings that could be numbers with or without a period:

Function

def is_float(s):
    result = False
    if s.count(".") == 1:
        if s.replace(".", "").isdigit():
            result = True
    return result

Lambda version

is_float = lambda x: x.replace('.','',1).isdigit() and "." in x

Example

if is_float(some_string):
    some_string = float(some_string)
elif some_string.isdigit():
    some_string = int(some_string)
else:
    print("Does not convert to int or float.")

This way you aren't accidentally converting what should be an int into a float.
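
The same int-before-float ordering can also be written with try/except instead of isdigit(), which additionally accepts signs and exponents. This swaps in a different technique from the one in the answer, and the name convert_number is hypothetical.

```python
def convert_number(text):
    """Return an int or float parsed from text, or text unchanged."""
    # Try int first so "12" stays an int; fall back to float for "12.5" or "1e3".
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text

# Show the resulting type for a few representative inputs.
for sample in ("12", "-7", "12.5", "1e3", "abc"):
    value = convert_number(sample)
    print(repr(sample), "->", repr(value), type(value).__name__)
```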

This regex will check for scientific floating point numbers:

^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$

However, I believe that your best bet is to use the parser in a try.

I used the function already mentioned, but soon noticed that strings such as "Nan", "Inf" and their variations are considered numbers. So I propose an improved version of the function that returns False for that type of input and does not fail on "1e3" variants:

def is_float(text):
    # check for nan/infinity etc., with or without a leading sign
    if text.strip().lstrip('+-').isalpha():
        return False
    try:
        float(text)
        return True
    except ValueError:
        return False
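
An alternative to inspecting the characters is to do the conversion first and then ask math.isfinite() whether the result is an ordinary number. This is a different technique from the isalpha() check above, sketched here under the assumption that values which overflow to infinity (such as "6e777777") should be rejected too; the name is_finite_float is made up for illustration.

```python
import math

def is_finite_float(text):
    """True if text converts to a float that is neither nan nor +/-inf."""
    try:
        return math.isfinite(float(text))
    except (TypeError, ValueError):
        return False

# "1e3" passes; nan, infinities and overflowing literals do not.
for sample in ("1e3", "nan", "-inf", "Infinity", "6e777777", "abc"):
    print(repr(sample), is_finite_float(sample))
```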

You can use the try - except - else clause; this will catch any conversion/value errors raised when the value passed cannot be converted to a float.


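def try_parse_float(item):
    try:
        value = float(item)
    except (TypeError, ValueError, OverflowError):
        # not convertible: None, a non-numeric string, an oversized int, ...
        return None
    else:
        # reached only when float(item) raised nothing
        return value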

A simple function that converts a string to the right type of number without using try and except:

def number_type(number):
    if number.isdigit():
        return int(number)
    # drop at most one '.', so "1.2.3" is not mistaken for a float
    elif number.replace(".", "", 1).isdigit():
        return float(number)
    else:
        return type(number)

I was looking for some similar code, but it looks like using try/except is the best way. Here is the code I'm using. It includes a retry if the input is invalid. I needed to check if the input was greater than 0 and if so convert it to a float.

def cleanInput(question,retry=False): 
    inputValue = input("\n\nOnly positive numbers can be entered, please re-enter the value.\n\n{}".format(question)) if retry else input(question)
    try:
        if float(inputValue) <= 0 : raise ValueError()
        else : return(float(inputValue))
    except ValueError : return(cleanInput(question,retry=True))


willbefloat = cleanInput("Give me the number: ")

Try to convert to float. If there is an error, print the ValueError exception.

try:
    x = float('1.23')
    print('val=',x)
    y = float('abc')
    print('val=',y)
except ValueError as err:
    print('floatErr:',err)

Output:

val= 1.23
floatErr: could not convert string to float: 'abc'

Passing a dictionary as the argument, this converts the string values that can be converted to float and leaves the others unchanged:

def covertDict_float(data):
    for i in data:
        # only attempt values whose part before the '.' is all digits
        if data[i].split(".")[0].isdigit():
            try:
                data[i] = float(data[i])
            except ValueError:
                continue
    return data
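
A variant that returns a new dictionary instead of modifying the argument might look like the sketch below. It is not the author's function: the name convert_dict_values is hypothetical, non-string values are passed through untouched, and every string that float() accepts is converted (including "-1" and "nan", which the isdigit() pre-check above would skip).

```python
def convert_dict_values(data):
    """Return a new dict with float-like string values converted to float."""
    converted = {}
    for key, value in data.items():
        try:
            # Only strings are parsed; ints, None, lists etc. are kept as-is.
            converted[key] = float(value) if isinstance(value, str) else value
        except ValueError:
            converted[key] = value
    return converted

print(convert_dict_values({"a": "1.5", "b": "x", "c": 3, "d": "-2"}))
```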

I tried some of the simple options above, using a try test around converting to a float, and found that most of the replies share a problem.

Simple test (along the lines of the above answers):

entry = ttk.Entry(self, validate='key')
entry['validatecommand'] = (entry.register(_test_num), '%P')

def _test_num(P):
    try: 
        float(P)
        return True
    except ValueError:
        return False

The problem comes when:

  • You enter '-' to start a negative number:

You are then trying float('-'), which fails.

  • You enter a number, but then try to delete all the digits:

You are then trying float(''), which likewise fails.

The quick solution I had is:

def _test_num(P):
    if P == '' or P == '-': return True
    try: 
        float(P)
        return True
    except ValueError:
        return False
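
The same idea extends to other half-typed inputs, such as a lone '.' or '-.'. The sketch below is independent of Tk so it can be tried on its own; the set of allowed prefixes and the name is_valid_partial_float are assumptions, to be adjusted to whatever the entry field should permit.

```python
# Strings a user may pass through on the way to a valid number (an assumed list).
_INCOMPLETE = {'', '-', '+', '.', '-.', '+.'}

def is_valid_partial_float(text):
    """True if text is a float or a plausible beginning of one."""
    if text in _INCOMPLETE:
        return True
    try:
        float(text)
        return True
    except ValueError:
        return False

# Simulate typing "-.5" one keystroke at a time.
for typed in ('-', '-.', '-.5'):
    print(repr(typed), is_valid_partial_float(typed))
```

A half-typed exponent such as '1e' is still rejected by this sketch, so exponent entry would need further prefixes.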

It seems many of the regexes given miss one thing or another. This has been working for me so far:

(?i)^\s*[+-]?(?:inf(inity)?|nan|(?:\d+\.?\d*|\.\d+)(?:e[+-]?\d+)?)\s*$

It allows for infinity (or inf) with sign, nan, no digit before the decimal, and leading/trailing spaces (if desired). The ^ and $ are needed to keep from partially matching something like 1.2f-2 as 1.2.

You could use [ed] instead of just e if you need to parse some files where D is used for double-precision scientific notation. You would want to replace it afterward or just replace them before checking since the float() function won't allow it.
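
Put together, the [ed] variant plus the replacement step could be sketched as follows. The function name parse_float is hypothetical, and the inline (?i) flag is written as re.IGNORECASE here, which is intended to be equivalent.

```python
import re

# The pattern from above with [ed] in the exponent; case is ignored via the flag.
_FLOAT_RE = re.compile(
    r"^\s*[+-]?(?:inf(?:inity)?|nan|(?:\d+\.?\d*|\.\d+)(?:[ed][+-]?\d+)?)\s*$",
    re.IGNORECASE,
)

def parse_float(text):
    """Return the float value of text, or None if it does not match."""
    if _FLOAT_RE.match(text) is None:
        return None
    # float() does not understand a D exponent, so rewrite it as E first.
    # Matching strings contain 'd' only as the exponent marker.
    return float(text.replace('d', 'e').replace('D', 'E'))

for sample in ("1.5D3", "1d-2", " -inf ", ".5", "1.2f-2"):
    print(repr(sample), parse_float(sample))
```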

I found a way that could also work, though it still needs verifying. (First time posting here.)

def isfloat(a_str):
    try:
        x = float(a_str)
    except Exception:
        return False
    # True only for values with a fractional part; "2" and "2.0" give False
    return x % 1 != 0

This works like a charm:

[dict([a,int(x) if isinstance(x, str)
 and x.isnumeric() else float(x) if isinstance(x, str)
 and x.replace('.', '', 1).isdigit() else x] for a, x in json_data.items())][0]

I've written my own functions. Instead of float(value), I use floatN() or floatZ(), which return None or 0.0 if the value can't be cast as a float. I keep them in a module I've called safeCasts.

def floatN(value):
    try:
        if value is not None:
            fvalue = float(value)
        else:
            fvalue = None
    except ValueError:
        fvalue = None

    return fvalue


def floatZ(value):
    try:
        if value is not None:
            fvalue = float(value)
        else:
            fvalue = 0.0
    except ValueError:
        fvalue = 0.0

    return fvalue

In other modules I import them

from safeCasts import floatN, floatZ

then use floatN(value) or floatZ(value) instead of float(). Obviously, you can use this technique for any cast function you need.
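
The generalisation to any cast function could be sketched with a single helper that takes the cast and the fallback as parameters. The name safe_cast and its signature are assumptions for illustration; unlike floatN() and floatZ(), this version also catches TypeError.

```python
def safe_cast(value, cast, default=None):
    """Return cast(value), or default if value is None or cannot be converted."""
    if value is None:
        return default
    try:
        return cast(value)
    except (TypeError, ValueError):
        return default

# Rough equivalents of floatN() and floatZ(), plus an int example.
print(safe_cast("1.5", float))         # like floatN("1.5")
print(safe_cast("oops", float, 0.0))   # like floatZ("oops")
print(safe_cast("7", int))
```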

It's a simple, yet interesting question. The solution presented below works fine for me:

import re

val = "25,000.93$"

regex = r"\D"

splitted = re.split(regex, val)
splitted = list(filter(str.isdecimal, splitted))

if splitted:
    if len(splitted) > 1:
        splitted.insert(-1, ".")

    try:
        f = float("".join(splitted))
        print(f, "is float.")
        
    except ValueError:
        print("Not a float.")
        
else:
    print("Not a float.")

Important note: this solution is based on the assumption that the last value in splitted contains the decimal places.

You can create a function isfloat() and use it in place of isdigit(); it accepts both integers and floats, but rejects non-numeric strings, as you would expect.

def isfloat(num):
    try:
        float(num)
        return True
    except ValueError:
        return False

a = input('How much is 1 share in that company? \n')

while not isfloat(a):
    print("You need to write a number!\n")
    a = input('How much is 1 share in that company? \n')

We can use a regex:

import re
if re.fullmatch(r'[0-9]*\.?[0-9]+', your_string):
    print("It's a float/int")
else:
    print("It's something alien")

Let me explain the regex in English:

  • * -> 0 or more occurrences
  • + -> 1 or more occurrences
  • ? -> 0 or 1 occurrence

Now, let's convert:

  • [0-9]* -> 0 or more digits in the range 0-9
  • \.? -> followed by 0 or 1 '.' (if the string must be a float rather than an int, use {1} instead of ?)
  • [0-9]+ -> followed by 1 or more digits in the range 0-9

str(strval).isdigit()

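seems to be simple.

Handles values stored as a string, int, or float, although it returns True only when the value's string form consists entirely of digits, so anything with a sign or a decimal point gives False.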


 