Regex match groups of digits followed or not by spaces, words
I'm trying to match the following with a regex:
101.6 x 101.6 mm
150 x 150 mm
490 x 100 x 380 mm
490 x 100 x 380 x 430 mm
280mm x 260 mm
and extract the values (digits) as separate groups. I'm using:
^(?P<value>[-\.\d]+)([\s]*)([x]+)
but I want something that doesn't care how many times the digits appear.
What I want to obtain as groups:
101.6, 101.6, mm
150, 150, mm
490, 100, 380, mm
490, 100, 380, 430, mm
280, 260, mm
I know this can be done with split as it is, but besides the examples above I also have other expressions that contain "x", and in those cases I don't want to split.
Given that all the strings in the example data end with mm, and that mm might also optionally occur directly after a number, you could match that optional occurrence and use a positive lookahead to assert that the string also ends with mm and that only the allowed parts come in between.
If you want to match multiple spaces, you could use [ ]+ (the brackets are there only for clarity). If other kinds of whitespace besides newlines can occur, you could use [^\S\r\n]* instead.
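To see how the two character classes differ, here is a minimal sketch. The sample string (with a tab and a newline in it) is made up for illustration and is not part of the original data.

```python
import re

# Hypothetical sample: a tab, a double space, single spaces and a newline
sample = "150\tx  150 mm\n490 x 100 mm"

# [ ]+ matches runs of literal space characters only
spaces_only = re.findall(r"[ ]+", sample)

# [^\S\r\n]+ matches runs of any whitespace except CR and LF,
# so the tab is included but the newline is not
horizontal_ws = re.findall(r"[^\S\r\n]+", sample)

print(spaces_only)
print(horizontal_ws)
```

The second class picks up the tab that the first one misses, while neither of them matches across the line break.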
Allowing for multiple spaces, you might use
\b(?P<value>\d+(?:\.\d+)?)(?: *mm)?(?=(?: +x +\d+(?:\.\d+)?)* mm\b)
For example:
import re
regex = r"\b(?P<value>\d+(?:\.\d+)?)(?: *mm)?(?=(?: +x +\d+(?:\.\d+)?)* mm\b)"
test_str = ("101.6 x 101.6 mm\n"
"150 x 150 mm\n"
"490 x 100 x 380 mm\n"
"490 x 100 x 380 x 430 mm\n"
"280mm x 260 mm")
print(re.findall(regex, test_str))
Output
['101.6', '101.6', '150', '150', '490', '100', '380', '490', '100', '380', '430', '280', '260']
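If you want the values grouped per input string with the unit at the end, as in the question, one possible sketch is to run the same pattern line by line. The helper name extract and the choice to append the literal "mm" are assumptions for illustration. They rely on the lookahead only succeeding when " mm" follows the values.

```python
import re

# Same pattern as above, compiled once for reuse
VALUE = re.compile(
    r"\b(?P<value>\d+(?:\.\d+)?)(?: *mm)?(?=(?: +x +\d+(?:\.\d+)?)* mm\b)"
)

def extract(line):
    """Return the numeric values of one dimension string, followed by the unit.

    Hypothetical helper: returns an empty list when nothing matches.
    """
    values = [m.group("value") for m in VALUE.finditer(line)]
    # A match is only possible when " mm" follows, so the unit is known
    return values + ["mm"] if values else []

lines = [
    "101.6 x 101.6 mm",
    "150 x 150 mm",
    "490 x 100 x 380 mm",
    "490 x 100 x 380 x 430 mm",
    "280mm x 260 mm",
]

for line in lines:
    print(", ".join(extract(line)))
```

A string that contains "x" but no trailing mm, such as "box 12", yields an empty list, so it is left unsplit.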