Python regex: extracting all occurrences of a substring in a string

Question

I am trying to use a Python regex to extract every occurrence of a substring from a string. This is what I have tried:

import re
line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
m = re.findall(r'\d+x.*?[a-zA-Z]', line)
print(m)

The output I get is ['10x35c', '30x35c']

The output I want is ["10'x20'", '10x35cm', '30x35cm']
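
For illustration, here is a short check of why the attempt falls short. It only re-runs the pattern from the question in two steps; nothing else is assumed.

```python
import re

line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"

# \d+x needs an "x" directly after the digits, so "10'x20'" never starts a match.
starts = re.findall(r"\d+x", line)
print(starts)   # ['10x', '30x']

# .*? is lazy and [a-zA-Z] matches exactly one letter, so "cm" is cut to "c".
attempt = re.findall(r"\d+x.*?[a-zA-Z]", line)
print(attempt)  # ['10x35c', '30x35c']
```

So two things need to change: a quote mark has to be allowed between the digits and the x, and the trailing unit has to be matched as a run of letters rather than a single one.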

Answer 1

You can do this without a regex by using split:

In [1089]: m = [i.split(':')[1].strip() for i in line.split(',')]

In [1090]: m
Out[1090]: ["10'x20'", '10x35cm', '30x35cm']
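
The one-liner above assumes every comma-separated part contains a colon; a part without one raises an IndexError. A minimal sketch of a more forgiving variant follows. The helper name values_after_colon is made up for this example.

```python
def values_after_colon(text):
    """Return the text after the first colon in each comma-separated part."""
    values = []
    for part in text.split(','):
        # partition() returns an empty separator when the part has no colon,
        # so such parts can be skipped instead of raising IndexError.
        _, sep, value = part.partition(':')
        if sep:
            values.append(value.strip())
    return values


line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
dims = values_after_colon(line)
print(dims)  # ["10'x20'", '10x35cm', '30x35cm']
```

This still depends on the "label: value, label: value" layout of the input. If the dimensions can appear anywhere in free text, a regex is probably the better fit.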

Answer 2

You can use this regex:

r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?"

Code:

>>> import re
>>> line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
>>> print(re.findall(r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?", line))
["10'x20'", '10x35cm', '30x35cm']

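Regex details:

\d+ : matches 1+ digits
['\"]? : matches an optional ' or "
x : matches the letter x
\d+ : matches 1+ digits
['\"]? : matches an optional ' or "
(?:\s*[a-zA-Z]+)? : matches an optional unit made of optional whitespace followed by 1+ letters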
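
If the individual numbers and the unit are needed rather than the whole match, the same pattern can be given named groups. This is a sketch, not part of the original answer: the group names width, height, and unit and the constant name DIMENSION are chosen for illustration.

```python
import re

# The pattern above, written with re.VERBOSE and named groups (names are illustrative).
DIMENSION = re.compile(
    r"""
    (?P<width>\d+)['"]?             # digits, then an optional quote mark
    x                               # the literal separator
    (?P<height>\d+)['"]?            # digits, then an optional quote mark
    (?:\s*(?P<unit>[a-zA-Z]+))?     # optional unit such as cm
    """,
    re.VERBOSE,
)

line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"

rows = [
    (m.group(0), m.group("width"), m.group("height"), m.group("unit"))
    for m in DIMENSION.finditer(line)
]
for row in rows:
    print(row)
# ("10'x20'", '10', '20', None)
# ('10x35cm', '10', '35', 'cm')
# ('30x35cm', '30', '35', 'cm')
```

One caveat: because of \s*, a word that follows the numbers after a space is taken as the unit, so "10x20 and" yields the unit "and". If that matters for your input, dropping \s* or listing the allowed units explicitly would avoid it.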

Answer 3

Use:

import re
string = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
print(re.findall(r"""\d+'?x\d+'?(?: *[a-z]+)?""", string, re.I))

Result: ["10'x20'", '10x35cm', '30x35cm']

re.I enables case-insensitive matching.

Explanation:

--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  '?                       a single quote ' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  x                        'x'
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  '?                       a single quote ' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
     *                       ' ' (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    [a-z]+                   any character of: 'a' to 'z' (1 or more
                             times (matching the most amount possible))
--------------------------------------------------------------------------------
  )?                       end of grouping
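
To see what re.I changes in practice, the pattern can be run with and without the flag. The sample string below is invented for this comparison and is not from the question.

```python
import re

pattern_text = r"\d+'?x\d+'?(?: *[a-z]+)?"

# Hypothetical input with an upper-case separator and unit.
sample = "10X35CM, 12'x8', 5x7 mm"

with_flag = re.findall(pattern_text, sample, re.I)
without_flag = re.findall(pattern_text, sample)

print(with_flag)     # ['10X35CM', "12'x8'", '5x7 mm']
print(without_flag)  # ["12'x8'", '5x7 mm']
```

With re.I the literal x also matches X, which may or may not be wanted. Note also that this pattern accepts only ' (not ") after the digits and only literal spaces before the unit.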

3 answers

Answer 1
1 2021-01-12 17:00:26

Answer 2
1 accepted 2021-01-12 17:03:13

Answer 3
0 2021-01-12 22:32:43