Python: replace middle digits with comma thousands separators
I have a string like this:
123456789.123456789-123456789
There can be any number of digits before and after the decimal point and the hyphen. I need to remove everything up to and including the decimal point, and remove the hyphen and everything after it. Then I need to insert comma thousands separators into the middle group of digits (the part I keep).
So here the output would be:
123,456,789
I can use lookarounds to capture the digits in the middle, but that doesn't replace the other digits, and I'm not sure how to place commas using lookarounds.
(?<=\.)\d+(?=-)
Then I figured I could use a capturing group like this, which works, but I'm not sure how to insert the commas:
\d+\.(\d+)-\d+
How can I insert commas using one of the above regexes?
Don't try to insert the thousands separators with a regex. Just pick out the middle number and use a function to produce the replacement; re.sub() accepts a function as the replacement argument:
re.sub(r'\d+\.(\d+)-\d+', lambda m: format(int(m.group(1)), ','), inputtext)
When used with the format() function, the , format specifier for integers formats the number with thousands separators:
>>> import re
>>> inputtext = '123456789.123456789-123456789'
>>> re.sub(r'\d+\.(\d+)-\d+', lambda m: format(int(m.group(1)), ','), inputtext)
'123,456,789'
This will of course still work in a larger body of text containing the number, dot, number, dash, number sequence.
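To illustrate the point about larger text, here is a minimal sketch. The sample sentence and the helper name add_separators are made up for illustration; the pattern is the one from the answer.

```python
import re

# Same pattern as above, compiled once. The name PATTERN is an assumption.
PATTERN = re.compile(r'\d+\.(\d+)-\d+')

def add_separators(text):
    # Hypothetical helper: replace each "digits.digits-digits" run with its
    # middle group, formatted with comma thousands separators.
    return PATTERN.sub(lambda m: format(int(m.group(1)), ','), text)

sample = 'ids: 1.1234567-9 and 22.1000-33'
print(add_separators(sample))  # ids: 1,234,567 and 1,000
```

Note that int() discards leading zeros, so a middle group such as 000123 comes out as 123.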
The format() function is closely related to the str.format() method but doesn't require a full string template (so no {} placeholder or field names are required).
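As a small side-by-side sketch of that relationship (the variable names are arbitrary, and the f-string form is an addition that is not mentioned in the answer):

```python
n = 123456789

# format() takes the value and a bare format spec.
a = format(n, ',')

# str.format() needs a template with a {} placeholder; the spec follows the colon.
b = '{:,}'.format(n)

# An f-string (Python 3.6+) accepts the same spec after the colon.
c = f'{n:,}'

print(a, b, c)  # 123,456,789 123,456,789 123,456,789
```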
You've asked for a full regular expression here, but it would probably be easier to split your string:
>>> import re
>>> s = '123456789.123456789-123456789'
>>> '{:,}'.format(int(re.split('[.-]', s)[1]))
'123,456,789'
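If the input always has exactly this shape, the split can also be done without the re module. This is a sketch of an alternative using str.partition(), not something the answer proposes; the variable names are arbitrary.

```python
s = '123456789.123456789-123456789'

# Drop everything up to and including the first dot.
_, _, rest = s.partition('.')

# Keep only what comes before the first hyphen.
middle, _, _ = rest.partition('-')

result = '{:,}'.format(int(middle))
print(result)  # 123,456,789
```

Unlike re.sub(), this only works when the whole string is the number sequence; int() raises ValueError if the dot or hyphen is missing and the remaining text is not purely digits.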
If you prefer using a regular expression, use a function call or lambda as the replacement:
>>> import re
>>> s = '123456789.123456789-123456789'
>>> re.sub(r'\d+\.(\d+)-\d+', lambda m: '{:,}'.format(int(m.group(1))), s)
'123,456,789'
You can take a look at the different format specifications.
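A few specs from the format specification mini-language, as a quick sketch (the sample values are arbitrary):

```python
n = 1234567

comma = format(n, ',')                  # comma as thousands separator
underscore = format(n, '_')             # underscore separator (Python 3.6+)
padded = format(n, '>12,')              # right-aligned in a field 12 wide
decimals = format(1234567.891, ',.2f')  # separators plus two decimal places

print(comma)       # 1,234,567
print(underscore)  # 1_234_567
print(repr(padded))
print(decimals)    # 1,234,567.89
```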