
How to convert a string to a number if it has commas in it as thousands separators?

I have a string that represents a number and uses commas as thousands separators. How can I convert it to a number in Python?

>>> int("1,000,000")

This raises a ValueError.

I could replace the commas with empty strings before I try to convert it, but that feels wrong somehow. Is there a better way?

import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
locale.atoi('1,000,000')
# 1000000
locale.atof('1,000,000.53')
# 1000000.53

There are several ways to parse numbers with thousands separators, and I doubt that the approach described by @unutbu is the best in all cases, so I list the options here.

  1. The proper place to call setlocale() is in the __main__ module. It is a global setting that affects the whole program and even C extensions (although note that the LC_NUMERIC setting is not set at the system level but is emulated by Python). Read the caveats in the documentation and think twice before going this way. It's probably OK in a single application, but never use it in libraries intended for a wide audience. You should probably also avoid requesting a locale with a particular charset encoding, since it might not be available on some systems.

  2. Use a third-party internationalization library. For example, PyICU allows using any available locale without affecting the whole process (and even parsing numbers with particular thousands separators without using locales):

    from icu import Locale, NumberFormat
    NumberFormat.createInstance(Locale('en_US')).parse("1,000,000").getLong()

  3. Write your own parsing function if you don't want to install third-party libraries to do it the "right way". It can be as simple as int(data.replace(',', '')) when strict validation is not needed.
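For option 3, when some validation is wanted, a hand-written parser might look like the sketch below. The name parse_grouped_number and the exact rule it enforces (groups of exactly three digits separated by ",", with "." as the decimal point) are assumptions made for illustration, not something the answer above prescribes.

```python
import re

# Hypothetical helper: accepts "1,000,000", "-1,234.5", or plain "1000".
# Assumes "," groups digits in threes and "." is the decimal point.
_GROUPED = re.compile(r'[+-]?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?')

def parse_grouped_number(text):
    """Return an int or a float; raise ValueError on misplaced commas."""
    text = text.strip()
    if not _GROUPED.fullmatch(text):
        raise ValueError("not a valid grouped number: %r" % text)
    cleaned = text.replace(',', '')
    # Keep whole numbers as int, anything with a decimal point as float.
    return float(cleaned) if '.' in cleaned else int(cleaned)
```

Unlike a bare replace(',', ''), this rejects inputs such as "1,00,000", at the cost of hard-coding one separator convention.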

Replace the commas with empty strings, and turn the resulting string into an int or a float.

>>> a = '1,000,000'
>>> int(a.replace(',', ''))
1000000
>>> float(a.replace(',', ''))
1000000.0

I got a locale error from the accepted answer, but the following change works here in Finland (Windows XP):

import locale
locale.setlocale(locale.LC_ALL, 'english_USA')
print(locale.atoi('1,000,000'))
# 1000000
print(locale.atof('1,000,000.53'))
# 1000000.53
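Since the locale name differs between platforms ('en_US.UTF-8' in the accepted answer, 'english_USA' on Windows here), one option is to try several names in turn. The sketch below is only an illustration: the helper name atoi_with_locale and the candidate list are assumptions, and it restores the previous LC_NUMERIC setting afterwards so the change does not leak into the rest of the program.

```python
import locale

def atoi_with_locale(text, candidates=('en_US.UTF-8', 'english_USA')):
    """Parse text with the first available candidate locale, then restore
    the previous LC_NUMERIC setting. Returns None if no candidate exists."""
    previous = locale.setlocale(locale.LC_NUMERIC)  # query only, no change
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_NUMERIC, name)
            except locale.Error:
                continue  # this name is unknown on the current platform
            return locale.atoi(text)
        return None
    finally:
        # Runs on return and on exceptions alike.
        locale.setlocale(locale.LC_NUMERIC, previous)
```

Note that setlocale() still changes process-wide state while the function runs, so this is not safe to call from several threads at once.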

This works (a dirty but quick way):

>>> a='-1,234,567,89.0123'
>>> "".join(a.split(","))
'-123456789.0123'

I tried this. It goes a bit beyond the question: the input is first converted to a string (it may be a list, for example from Beautiful Soup), then to an int if possible, and otherwise to a float.

It goes as far as it can. In the worst case, it returns the unconverted string.

import locale

def to_normal(soupCell):
    '''Convert an HTML cell from Beautiful Soup to text, then to int, then to float: as far as it gets.
    US thousands separators are taken into account.'''

    # Note: setlocale() changes the locale for the whole process.
    locale.setlocale(locale.LC_ALL, 'english_USA')

    # Text of the first string node in the cell (use unicode() instead of str() on Python 2).
    output = str(soupCell.findAll(text=True)[0].string)
    try:
        return locale.atoi(output)
    except ValueError:
        try:
            return locale.atof(output)
        except ValueError:
            # Neither int nor float: return the text unchanged.
            return output

>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'en_US.UTF-8'
>>> print(locale.atoi('1,000,000'))
1000000
>>> print(locale.atof('1,000,000.53'))
1000000.53

This was run on Linux in the US.
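With setlocale(locale.LC_ALL, "") the outcome depends on the user's environment, so it can help to check which separators are actually active before parsing. A small sketch using locale.localeconv() (the helper name active_separators is made up for illustration):

```python
import locale

def active_separators():
    """Return (thousands_sep, decimal_point) for the current locale."""
    conv = locale.localeconv()
    return conv['thousands_sep'], conv['decimal_point']

thousands, decimal = active_separators()
print(repr(thousands), repr(decimal))
```

In the default "C" locale the thousands separator is an empty string, which is why locale.atoi('1,000,000') fails until a suitable locale has been set.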

A little late, but the babel library has parse_decimal and parse_number, which do exactly what you want:

from babel.numbers import parse_decimal, parse_number
parse_decimal('10,3453', locale='es_ES')
# Decimal('10.3453')
parse_number('20.457', locale='es_ES')
# 20457
parse_decimal('10,3453', locale='es_MX')
# Decimal('103453')

You can also pass a Locale object instead of a string:

from babel import Locale
parse_decimal('10,3453', locale=Locale.parse('es_MX'))
# Decimal('103453')

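If you are using pandas and are trying to parse a CSV file containing numbers that use commas as thousands separators, you can pass the keyword argument thousands=',' like this: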

import pandas as pd
df = pd.read_csv('your_file.csv', thousands=',')

Try this:

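def changenum(data):
    # Build a copy of the string without the comma separators.
    foo = ""
    for ch in data:
        if ch == ",":
            continue
        foo += ch
    # int() rejects a fractional part, so this handles whole numbers only.
    return float(int(foo))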
