
Regex match groups of digits followed or not by spaces, words

I'm trying to match the following strings with a regex:

101.6 x 101.6 mm
150   x      150 mm
490 x 100 x 380 mm
490 x 100 x 380 x 430 mm
280mm x 260 mm

and extract the values (digits) as separate groups. I'm using:

^(?P<value>[-\.\d]+)([\s]*)([x]+) 

but I want something that doesn't care how many times the digits appear.

What I want to obtain as groups:

101.6, 101.6, mm
150, 150, mm
490, 100, 380, mm
490, 100, 380, 430, mm
280, 260, mm

I know this can be done with split as it is, but besides the examples above, I also have other expressions that contain "x", and in their case I don't want to split.

Given that all the strings in the example data end with mm, and that mm might also optionally occur directly after a number, you could match that optional occurrence and use a positive lookahead to assert that the number is eventually followed by mm, with only the allowed parts (further "x number" groups) in between.

If you want to match multiple spaces, you could use [ ]+, with the brackets in this case for clarity. If there can be other kinds of whitespace, excluding newlines, you could use [^\S\r\n]* instead.
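As a rough sketch of the second option: the pattern below is the space-based pattern given further down in this answer, with every literal space swapped for the class [^\S\r\n]. That substitution, the name ws_regex, and the tab-separated sample string are assumptions made for illustration, not part of the original data.

```python
import re

# Assumption: each literal space of the space-based pattern is replaced by
# [^\S\r\n], i.e. "any whitespace character except \r and \n".
ws_regex = re.compile(
    r"\b(?P<value>\d+(?:\.\d+)?)(?:[^\S\r\n]*mm)?"
    r"(?=(?:[^\S\r\n]+x[^\S\r\n]+\d+(?:\.\d+)?)*[^\S\r\n]mm\b)"
)

# Hypothetical input that uses tabs around the "x" instead of spaces.
sample = "150\tx\t150 mm"

print(ws_regex.findall(sample))  # ['150', '150']
```

Because newlines are excluded from the class, the lookahead still cannot run across line boundaries in multi-line input.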

To allow multiple spaces, you might use

\b(?P<value>\d+(?:\.\d+)?)(?: *mm)?(?=(?: +x +\d+(?:\.\d+)?)* mm\b)
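To make the parts of this pattern easier to follow, here is the same expression written with re.VERBOSE and a comment per part. This is only a restatement for readability; the variable name pattern is arbitrary, and literal spaces are written as [ ] because verbose mode ignores bare spaces.

```python
import re

# The same pattern, spelled out in verbose mode.
# A bare space is ignored under re.VERBOSE, so literal spaces become [ ].
pattern = re.compile(r"""
    \b
    (?P<value>\d+(?:\.\d+)?)           # a number, optionally with a decimal part
    (?:[ ]*mm)?                        # optional unit directly after the number
    (?=                                # lookahead: what follows must be
        (?:[ ]+x[ ]+\d+(?:\.\d+)?)*    #   zero or more " x <number>" parts
        [ ]mm\b                        #   and then a space and the unit "mm"
    )
""", re.VERBOSE)

print(pattern.findall("490 x 100 x 380 mm"))  # ['490', '100', '380']
print(pattern.findall("box 12"))              # [] (no "mm" after the number)
```

Since the lookahead only checks what follows without consuming it, each number in the chain is matched separately, however many there are.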


For example:

import re

regex = r"\b(?P<value>\d+(?:\.\d+)?)(?: *mm)?(?=(?: +x +\d+(?:\.\d+)?)* mm\b)"

test_str = ("101.6 x 101.6 mm\n"
    "150   x      150 mm\n"
    "490 x 100 x 380 mm\n"
    "490 x 100 x 380 x 430 mm\n"
    "280mm x 260 mm")

print(re.findall(regex, test_str))

Output

['101.6', '101.6', '150', '150', '490', '100', '380', '490', '100', '380', '430', '280', '260']
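Note that findall on the whole text returns one flat list. If the values should stay grouped per input line with the unit at the end, as in the question, one possible approach is to apply the pattern line by line. The helper name dimensions, the hard-coded "mm" unit, and the extra "box 12" line are assumptions for this sketch.

```python
import re

value_re = re.compile(
    r"\b(?P<value>\d+(?:\.\d+)?)(?: *mm)?(?=(?: +x +\d+(?:\.\d+)?)* mm\b)"
)

def dimensions(line):
    """Return the numbers found in one line followed by the unit, or [] if none match."""
    values = value_re.findall(line)
    # Every match is only accepted when " mm" follows the chain of numbers,
    # so the unit can be appended whenever at least one value was found.
    return values + ["mm"] if values else []

lines = [
    "101.6 x 101.6 mm",
    "150   x      150 mm",
    "490 x 100 x 380 x 430 mm",
    "280mm x 260 mm",
    "box 12",  # hypothetical line that contains "x" but should not be split
]

for line in lines:
    print(dimensions(line))
# ['101.6', '101.6', 'mm']
# ['150', '150', 'mm']
# ['490', '100', '380', '430', 'mm']
# ['280', '260', 'mm']
# []
```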
