Regex for matching string Python
I want to match the numeric values in a string:
1,000 metric tonnes per contract month
Five cents ($0.05) per tonne
Five cents ($0.05) per tonne
1,000 metric tonnes per contract month
My current approach:
size = re.findall(r'(\d+(,?\d*).*?)', my_string)
What I get with my approach:
print size
[(u'1,000', u',000')]
As you can see, the number 1 was cut out of the second element of the list. Why is that? Also, could I get a hint as to how I can match the $0.05 terms?
Something like this:
>>> import re
>>> strs = """1,000 metric tonnes per contract month
Five cents ($0.05) per tonne
Five cents ($0.05) per tonne
1,000 metric tonnes per contract month"""
>>> [m.group(0) for m in re.finditer(r'\$?\d+([,.]\d+)?', strs)]
['1,000', '$0.05', '$0.05', '1,000']
Demo: http://rubular.com/r/UomzIY3SD3
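As a small illustration of why this answer uses re.finditer() with m.group(0) rather than re.findall(): the pattern contains one capturing group, so re.findall() would return only that group's text. The sketch below is written for Python 3, and the variable names are made up for the example.

```python
import re

text = """1,000 metric tonnes per contract month
Five cents ($0.05) per tonne"""

pattern = r'\$?\d+([,.]\d+)?'

# With one capturing group, findall returns only that group's text.
group_only = re.findall(pattern, text)

# finditer yields match objects; group(0) is always the whole match.
whole = [m.group(0) for m in re.finditer(pattern, text)]

print(group_only)  # [',000', '.05']
print(whole)       # ['1,000', '$0.05']
```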
When the pattern contains capturing groups, re.findall() returns a tuple of all the capturing groups for each match, and each set of normal parentheses generates one such group. In your output, the second element is the inner group (,?\d*), which matches only ',000'; the 1 belongs to the \d+ outside it. Write your regex like this:
size = re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d+)?', my_string)
Explanation:
\d{1,3} # One to three digits
(?:,\d{3})* # Optional thousands groups
(?:\.\d+)? # Optional decimal part
This assumes that all numbers have commas as thousands separators, i.e. no numbers like 1000000. If you need to match those too, use:
size = re.findall(r'\d+(?:,\d{3})*(?:\.\d+)?', my_string)
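To make the two points above concrete, here is a minimal sketch (Python 3; the sample string and variable names are invented for illustration). It shows the tuples produced by the capturing pattern from the question, and how the two non-capturing patterns differ on a number written without thousands separators.

```python
import re

sample = "1,000 metric tonnes, 1000000 units, 0.05 per tonne"

# Pattern from the question: two capturing groups, so one tuple per match.
capturing = re.findall(r'(\d+(,?\d*).*?)', "1,000")

# Non-capturing groups: findall returns the whole match as a string.
strict = re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d+)?', sample)
loose = re.findall(r'\d+(?:,\d{3})*(?:\.\d+)?', sample)

print(capturing)  # [('1,000', ',000')]
print(strict)     # ['1,000', '100', '000', '0', '0.05']
print(loose)      # ['1,000', '1000000', '0.05']
```

The strict pattern splits 1000000 into three-digit chunks because \d{1,3} can consume at most three digits at a time, which is why the \d+ variant is needed when separators are optional.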
Why are you grouping your regex? Try this:
r'\$?\d+,?\d*\.?\d*'
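A quick check of this ungrouped pattern, written with single backslashes inside the raw string, might look like the sketch below (Python 3; the test strings are invented). It handles the sample values, but since every part after \d+ is optional, it can also pick up trailing punctuation.

```python
import re

pattern = r'\$?\d+,?\d*\.?\d*'

# Values from the question are matched as whole strings (no groups involved).
sample_hits = re.findall(pattern, "Five cents ($0.05) per tonne, 1,000 tonnes")

# Caveat: a comma or period directly after a number is swallowed too.
loose_hits = re.findall(pattern, "In 2010, sales rose 5.")

print(sample_hits)  # ['$0.05', '1,000']
print(loose_hits)   # ['2010,', '5.']
```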
I would try this regex:
r'[0-9]+(?:,[0-9]+)*(?:\.[0-9]*)?'
Add \$? at the beginning to optionally catch the $.