
How to extract a number from a string in Python?

How do you extract a number from a string so that you can manipulate it? The number could be either an int or a float. For example, if the string is "flour, 100, grams" or "flour, 100.5, grams", I want to extract the number 100 or 100.5.

Code:

string  = "flour, 100, grams"
numbers = [int(x) for x in string.split(",")]
print(numbers)

Output:

Traceback (most recent call last):
  File "/Users/lewis/Documents/extracting numbers.py", line 2, in <module>
    numbers = [int(x) for x in string.split(",")]
  File "/Users/lewis/Documents/extracting numbers.py", line 2, in <listcomp>
    numbers = [int(x) for x in string.split(",")]
ValueError: invalid literal for int() with base 10: 'flour'

Given the structure of your strings, when you use str.split to split the string into a list of three strings, you should only take one of the three elements:

>>> s = "flour, 100, grams"
>>> s.split(",")
['flour', ' 100', ' grams']
>>> s.split(",")[1] # index the middle element (Python is zero-based)
' 100'

You can then use float to convert that string into a number:

>>> float(s.split(",")[1])
100.0
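
If you would rather keep 100 as an int and only fall back to float for values like 100.5, the two steps above could be wrapped in a small helper. This is only a sketch: the name middle_number is made up for illustration, and it assumes the number is always the second comma-separated field.

```python
def middle_number(s):
    """Return the second comma-separated field of s as an int or a float.

    Hypothetical helper: assumes the "name, amount, unit" layout from the question.
    """
    field = s.split(",")[1].strip()
    try:
        return int(field)
    except ValueError:
        # Not a whole number; float() raises ValueError if it is not numeric at all.
        return float(field)


print(middle_number("flour, 100, grams"))
print(middle_number("flour, 100.5, grams"))
```

Trying int first matters here, because float would also accept "100" and silently turn it into 100.0.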

If you can't be certain of the structure of the strings, you could use re (regular expressions) to extract the numbers and map to convert them all (wrapped in list, because map returns an iterator in Python 3):

>>> import re
>>> list(map(float, re.findall(r"""\d+ # one or more digits
...                                (?: # followed by...
...                                    \. # a decimal point
...                                    \d+ # and another set of one or more digits
...                                )? # zero or one times""",
...                            "Numbers like 1.1, 2, 34 and 15.16.",
...                            re.VERBOSE)))
[1.1, 2.0, 34.0, 15.16]
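
The pattern above ignores signs and converts everything to float. A possible variant, sketched here under the assumption that an optional leading + or - should count as part of the number, keeps whole numbers as int. The names NUMBER and extract_numbers are illustrative only, and exponents such as 1e3 are not handled.

```python
import re

# Optional sign, digits, optional fractional part.
# Assumed extension of the pattern above; a hyphen directly before a digit
# (as in "item-5") would be read as a minus sign.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return every number found in text, as int where possible, else float."""
    return [float(m) if "." in m else int(m) for m in NUMBER.findall(text)]


print(extract_numbers("Numbers like 1.1, 2, 34 and 15.16."))
print(extract_numbers("flour, -100.5, grams"))
```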

Have you tried wrapping the conversion in a try/except block? That discards the string flour but keeps the 100:

string = 'flour, 100, grams'
numbers = []

for i in string.split(','):
    try:
        numbers.append(int(i))  # store the converted number, not the original string
    except ValueError:
        pass  # skip fields that are not integers

print(numbers)
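
The same try/except idea can be extended to decimal values by converting with float instead of int. The sketch below is one way to do that; numeric_fields is a hypothetical name, and note that float also accepts strings such as "nan" and "inf", which may or may not be what you want.

```python
def numeric_fields(line, sep=","):
    """Return the fields of line that float() can parse, in order."""
    numbers = []
    for field in line.split(sep):
        try:
            numbers.append(float(field))  # float() tolerates surrounding spaces
        except ValueError:
            continue  # not a number, e.g. "flour" or " grams"
    return numbers


print(numeric_fields("flour, 100, grams"))
print(numeric_fields("flour, 100.5, grams, 2"))
```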

Write yourself a little conversion function like the one below, which attempts to convert its argument first into an int, then into a float, then into a complex (just extending the example). If you wish to obtain the most appropriate type for the input, the order of the attempted conversions is important: a string such as '100' converts successfully to both int and float, but '100.5' converts only to float, so you need to attempt the int conversion first.

def convert_to_number(n):
    candidate_types = (int, float, complex)
    for t in candidate_types:
        try:
            return t(str(n))
        except ValueError:
            # debugging aid; replace the print with pass when not needed
            print("{!r} is not {}".format(n, t))
    else:
        raise ValueError('{!r} can not be converted to any of: {}'.format(n, candidate_types))

>>> s = "flour, 100, grams"
>>> n = convert_to_number(s.split(',')[1])
>>> type(n)
<class 'int'>
>>> n
100

>>> s = "flour, 100.123, grams"
>>> n = convert_to_number(s.split(',')[1])
' 100.123' is not <class 'int'>
>>> type(n)
<class 'float'>
>>> n
100.123

>>> n = convert_to_number('100+20j')
'100+20j' is not <class 'int'>
'100+20j' is not <class 'float'>
>>> type(n)
<class 'complex'>
>>> n
(100+20j)

>>> n = convert_to_number('one')
'one' is not <class 'int'>
'one' is not <class 'float'>
'one' is not <class 'complex'>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/ctn.py", line 10, in convert_to_number
    raise ValueError('{!r} can not be converted to any of: {}'.format(n, candidate_types))
ValueError: 'one' can not be converted to any of: (<class 'int'>, <class 'float'>, <class 'complex'>)

You could use regular expressions to pluck out the numeric fields from each line of input as per jonrsharpe's answer.

A simple way to extract numbers from a string is re.findall, which returns every run of digits it finds, however many digits each one has.

Get integer numbers:

import re
s = 'flour, 100, grams, 200HC'
print(re.findall(r'\d+', s))  # raw string, so \d is not treated as a string escape
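
Note that re.findall returns strings, so the matches still need converting before you can do arithmetic with them. A minimal sketch, assuming every run of digits should become an int:

```python
import re

s = 'flour, 100, grams, 200HC'

# Convert each matched digit run from str to int.
integers = [int(d) for d in re.findall(r'\d+', s)]
print(integers)
print(sum(integers))

# Caveat: this pattern splits decimals at the point.
print(re.findall(r'\d+', 'flour, 100.5, grams'))
```

Because of that last caveat, this pattern is only suitable when the input is known to contain whole numbers.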

Get float numbers:

import re
print(list(map(float, re.findall(r"""\d+ # one or more digits
                                     (?: # followed by...
                                         \. # a decimal point
                                         \d+ # and another set of one or more digits
                                     )? # zero or one times""",
                                 "Numbers like 1.1, 2, 34 and 15.16.",
                                 re.VERBOSE))))


 