Python Regex: Extract all occurrences of a substring within a string
I am trying to extract all occurrences of a substring within a string using Python regex. This is what I have tried:
import re
line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
m = re.findall(r'\d+x.*?[a-zA-Z]', line)
print(m)
The output I get is ['10x35c', '30x35c']
The output I am trying to achieve is ["10'x20'", '10x35cm', '30x35cm']
You can do this without regex, using split:
In [1089]: m = [i.split(':')[1].strip() for i in line.split(',')]
In [1090]: m
Out[1090]: ["10'x20'", '10x35cm', '30x35cm']
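The split approach depends on each comma-separated piece having the form "label: value". As a minimal sketch of the same idea, the function below (the name extract_after_colon is hypothetical, not part of the answer) uses str.partition so that pieces without a colon are skipped instead of raising an IndexError:

```python
def extract_after_colon(text):
    """Return the text after the first ':' in each comma-separated piece."""
    results = []
    for piece in text.split(','):
        # partition never raises; sep is '' when the piece has no colon
        _, sep, value = piece.partition(':')
        if sep:
            results.append(value.strip())
    return results


line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
print(extract_after_colon(line))  # ["10'x20'", '10x35cm', '30x35cm']
```

This only works while the text keeps that exact layout; if a value itself contains a comma, a pattern-based approach is likely the safer choice.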
You can use this regex:
r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?"
Code:
>>> import re
>>> line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
>>> print(re.findall(r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?", line))
["10'x20'", '10x35cm', '30x35cm']
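For comparison, here is a small sketch that runs the question's pattern and this pattern side by side on the same input. The variable names original and fixed are only for illustration:

```python
import re

line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"

# Pattern from the question: requires 'x' directly after the digits
# (so 10'x20' is never matched) and the lazy .*? stops at the first letter.
original = re.findall(r'\d+x.*?[a-zA-Z]', line)

# Pattern from this answer: allows an optional quote on either side of
# the second number and consumes the whole run of unit letters.
fixed = re.findall(r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?", line)

print(original)  # ['10x35c', '30x35c']
print(fixed)     # ["10'x20'", '10x35cm', '30x35cm']
```

One thing to keep in mind: because the unit part is \s*[a-zA-Z]+, a dimension followed by a space and an ordinary word (for example "10x20 and") would have that word included in the match.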
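Regex details:
\d+: matches 1+ digits
['\"]?: matches an optional ' or "
x: matches the letter x
\d+: matches 1+ digits
['\"]?: matches an optional ' or "
(?:\s*[a-zA-Z]+)?: matches an optional unit made up of optional whitespace followed by 1+ letters

Use: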
import re
string = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
print(re.findall(r"""\d+'?x\d+'?(?: *[a-z]+)?""", string, re.I))
Result: ["10'x20'", '10x35cm', '30x35cm']
See Python proof. re.I stands for case-insensitive matching.
Explanation:
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  '?                       '\'' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  x                        'x'
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  '?                       '\'' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
     *                       ' ' (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    [a-z]+                   any character of: 'a' to 'z' (1 or more
                             times (matching the most amount possible))
--------------------------------------------------------------------------------
  )?                       end of grouping
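The same breakdown can be kept next to the pattern itself with re.VERBOSE. The following is a sketch rather than part of the answer; the name DIMENSION is hypothetical. Note that in verbose mode unescaped whitespace in the pattern is ignored, so the literal space before * is written as [ ]:

```python
import re

# Verbose form of the pattern above, compiled once for reuse.
DIMENSION = re.compile(r"""
    \d+          # first number: one or more digits
    '?           # optional single quote
    x            # literal x
    \d+          # second number: one or more digits
    '?           # optional single quote
    (?:          # optional unit, not captured
        [ ]*     #   zero or more spaces (bracketed so VERBOSE keeps it)
        [a-z]+   #   one or more letters
    )?
""", re.VERBOSE | re.IGNORECASE)

string = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
print(DIMENSION.findall(string))  # ["10'x20'", '10x35cm', '30x35cm']
```

Since re.IGNORECASE applies to the whole pattern, an uppercase X or uppercase unit letters are matched as well.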