Python: create a list of tuples of strings by splitting on a regex pattern

Question

Suppose I have these two strings:

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

I want to use a regular expression to match the pattern (number-words) and then split each string on it, to get a list of tuples:

final = [('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday'),
         ('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

I tried \([0-9]+-(.*?)\) but it didn't work.

What am I doing wrong? Any ideas on how to solve this?

Thanks in advance!

Answer 1

This might nudge you in the right direction:

>>> import re
>>> re.findall(r'^(.*) \((.+?)\)$', s1)
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
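To get from that single call to the list of tuples asked for in the question, one option is to run the same pattern over both strings and collect the groups. This is only a sketch: the names pattern and final are chosen here for illustration, and the pattern itself does not check for the number-words shape, it simply takes whatever parenthesised group ends the string.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# Same pattern as in the answer: a greedy prefix, one space, then the
# parenthesised group that closes the string.
pattern = re.compile(r'^(.*) \((.+?)\)$')

final = []
for s in (s1, s2):
    m = pattern.match(s)
    if m:  # skip strings that do not end with "(...)"
        final.append(m.groups())  # groups() returns a tuple (prefix, inner)

print(final)
```

Because the prefix is greedy, the earlier (bit-more complicated) group in s2 stays inside the first element instead of being treated as the split point.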

Answer 2

You can use this regex with findall:

>>> import re
>>> regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
>>> arr = ['hello 4, this is stackoverflow, looking for help (1345-today is wednesday)', 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)']
>>> for el in arr:
...     regx.findall(el)
...
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
[('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

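Regex details:

^(.*?) : matches 0 or more characters from the start of the string, lazily, into capture group #1
\s* : matches 0 or more whitespace characters
\((\d+\s*-\s*\w+[^)]*)\) : matches a (<number>-word ..) string and captures the content inside the parentheses in capture group #2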
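The breakdown above can also be written into the pattern itself with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. This is just a sketch of the same regex laid out differently; the name PATTERN is made up for the example.

```python
import re

# Same regex as above, spread over several lines with re.VERBOSE.
PATTERN = re.compile(r"""
    ^(.*?)          # group 1: leading text, matched lazily
    \s*             # optional whitespace before the parenthesis
    \(              # literal opening parenthesis
    (               # group 2: the content inside the parentheses
        \d+         #   one or more digits
        \s*-\s*     #   a hyphen, optionally surrounded by whitespace
        \w+         #   at least one word character
        [^)]*       #   everything up to the closing parenthesis
    )
    \)              # literal closing parenthesis
""", re.VERBOSE)

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# findall returns one (group 1, group 2) tuple per match.
result = [PATTERN.findall(s)[0] for s in (s1, s2)]
print(result)
```

The \d+ at the start of group 2 is what makes the regex skip (bit-more complicated) in the second string, since that group does not begin with a digit.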

Alternatively, you can use this regex with split:

>>> import re
>>> reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
>>> for el in arr:
...     reg.split(el)[:-1]
...
['hello 4, this is stackoverflow, looking for help', '1345-today is wednesday']
['hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday']

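Regex details:

(?<!\s) : negative lookbehind asserting that the previous character is not whitespace
\s* : matches 0 or more whitespace characters
(?=\((\d+\s*-\s*\w+[^)]*)\)) : lookahead asserting that a string of the form (<number>-word ..) follows. Note that the capture group is there so that split includes the text inside (...) in its result.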
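To see why the [:-1] slice is needed, it may help to look at the full list that split returns for one of the strings. A small sketch, with parts, head, inner, rest and pair being names picked for this example:

```python
import re

reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

parts = reg.split(s2)
# split yields three items here:
#   1. the text before the match
#   2. the capture group from inside the lookahead
#   3. the rest of the string, which still holds the "(...)" itself,
#      because a lookahead does not consume what it checks
head, inner, rest = parts
pair = (head, inner)

print(parts)
print(pair)
```

Only the whitespace in front of the parenthesis is consumed by the match, so the third item repeats the bracketed text, and dropping it leaves exactly the two pieces wanted.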
