
Python: replace middle digits with comma thousands separators

I have a string like this:

123456789.123456789-123456789

Any number of digits can appear before and after the decimal point and the hyphen. I need to remove everything up to and including the decimal point, and remove the hyphen and everything after it. Then I need to insert comma thousands separators into the middle group of digits (the part I need to keep).

So here the output would be:

123,456,789

I can use lookarounds to capture the digits in the middle, but then it won't replace the other digits, and I'm not sure how to place commas using lookarounds.

(?<=\.)\d+(?=-)

Then I figured I could use a capturing group like the one below, which will work, but I'm not sure how to insert the commas:

\d+\.(\d+)-\d+

How could I insert commas using one of the above regexes?

Don't try to insert the thousands separators with a regex; just pick out that middle number and use a function to produce the replacement. re.sub() accepts a function as the replacement argument:

re.sub(r'\d+\.(\d+)-\d+', lambda m: format(int(m.group(1)), ','), inputtext)

The , format specifier for integers, when used with the format() function, formats a number with thousands separators:

>>> import re
>>> inputtext = '123456789.123456789-123456789'
>>> re.sub(r'\d+\.(\d+)-\d+', lambda m: format(int(m.group(1)), ','), inputtext)
'123,456,789'

This will of course still work in a larger body of text containing the number, dot, number, dash, number sequence.
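To illustrate the larger-text case, here is a minimal sketch. The helper name add_separators and the sample sentence are made up for this example and are not part of the original answer. Note that int() also drops any leading zeros in the middle group.

```python
import re

# Digits, a dot, the digits to keep (group 1), a hyphen, then more digits.
PATTERN = re.compile(r'\d+\.(\d+)-\d+')

def add_separators(text):
    """Replace every number.number-number run with its middle number,
    formatted with comma thousands separators."""
    return PATTERN.sub(lambda m: format(int(m.group(1)), ','), text)

# Hypothetical input containing two matching sequences.
sample = 'first 1.1234567-9 then 22.890-33 end'
print(add_separators(sample))  # first 1,234,567 then 890 end
```

Text that doesn't match the pattern is left untouched, so only the matching sequences are rewritten.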

The format() function is closely related to the str.format() method but doesn't require a full string template (so no {} placeholder or field names required).
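As a quick check of that relationship, the sketch below formats the same integer three ways. The f-string form is an addition here, not something the answer mentions, and it assumes Python 3.6 or later.

```python
n = 123456789

with_function = format(n, ',')      # built-in format(): spec only, no template
with_method = '{:,}'.format(n)      # str.format(): spec inside a {} placeholder
with_fstring = f'{n:,}'             # f-string: same spec after the colon

print(with_function, with_method, with_fstring)
# 123,456,789 123,456,789 123,456,789
```

All three use the same format specification mini-language; they differ only in how the value and the spec are supplied.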

You've asked for a full regular expression here, but it would probably be easier to split your string:

>>> import re
>>> s = '123456789.123456789-123456789'
>>> '{:,}'.format(int(re.split(r'[.-]', s)[1]))
'123,456,789'
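If the input always has exactly one dot followed by one hyphen, the split can arguably be done without the re module at all. The following is a sketch under that assumption; the function name middle_with_commas is hypothetical.

```python
def middle_with_commas(s):
    """Return the digits between the first '.' and the next '-',
    formatted with comma thousands separators."""
    _, _, rest = s.partition('.')      # drop everything up to and including the dot
    middle, _, _ = rest.partition('-')  # drop the hyphen and everything after it
    return '{:,}'.format(int(middle))   # raises ValueError if middle isn't all digits

print(middle_with_commas('123456789.123456789-123456789'))  # 123,456,789
```

str.partition() splits on the first occurrence only, so this version does no validation of the surrounding digits.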

If you prefer using a regular expression, use a function call or lambda in the replacement:

>>> import re
>>> s = '123456789.123456789-123456789'
>>> re.sub(r'\d+\.(\d+)-\d+', lambda m: '{:,}'.format(int(m.group(1))), s)
'123,456,789'

You can take a look at the different format specifications.
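A few other specifications from the same mini-language, as a short sketch. The values are made up for illustration, and the _ separator assumes Python 3.6 or later.

```python
n = 123456789

comma_grouped = format(n, ',')                 # '123,456,789'
underscore_grouped = format(n, '_')            # '123_456_789'
two_decimals = format(1234567.891, ',.2f')     # '1,234,567.89'
right_aligned = format(n, '>15,')              # padded on the left to width 15

print(comma_grouped)
print(underscore_grouped)
print(two_decimals)
print(repr(right_aligned))
```

The , option always groups by three digits with a comma, regardless of locale.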

