Python: creating a list of string tuples by splitting on a regex pattern

Question

Suppose I have these two strings:

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

I want to use a regular expression to match the pattern (number-words) and then split each string to get a list of tuples:

final = [('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday'),
         ('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

I tried \([0-9]+-(.*?)\) but it did not work.

What am I doing wrong? Any ideas on how to solve this?

Thanks in advance!

Answer 1

This might nudge you in the right direction:

>>> import re
>>> re.findall(r'^(.*) \((.+?)\)$', s1)
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
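As a rough sketch, the same pattern could be applied to both strings to build the `final` list from the question. The helper name `split_trailing_parens` is made up for illustration, and the sketch assumes each string ends with the parenthesized part.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# Pattern from this answer: a greedy prefix, a space, then a "(...)" group
# that must end exactly at the end of the string.
TRAILING_PARENS = re.compile(r'^(.*) \((.+?)\)$')

def split_trailing_parens(text):
    """Hypothetical helper: return (prefix, inner) or None when nothing matches."""
    m = TRAILING_PARENS.match(text)
    return m.groups() if m else None

final = [split_trailing_parens(s) for s in (s1, s2)]
print(final)
```

Note that this pattern does not check that the parentheses start with a number, so it would also split a string such as 'foo (bar)'.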

Answer 2

You can use this regex with findall:

>>> import re
>>> regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
>>> arr = ['hello 4, this is stackoverflow, looking for help (1345-today is wednesday)', 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)']
>>> for el in arr:
...     regx.findall(el)
...
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
[('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

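Regex details:

^(.*?) : matches 0 or more characters from the start of the string, as few as possible, and captures them in group 1
\s* : matches 0 or more whitespace characters
\((\d+\s*-\s*\w+[^)]*)\) : matches a (<number>-word ..) string and captures the content inside the parentheses in group 2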
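The breakdown above can also be written into the pattern itself. The following is a minimal sketch that uses re.VERBOSE to annotate each piece and collects the groups into a list of tuples; the names `PATTERN` and `final` are chosen here for illustration.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# Same regex as above, spread over several lines with re.VERBOSE.
PATTERN = re.compile(r'''
    ^(.*?)          # group 1: shortest possible prefix from the start
    \s*             # optional whitespace before the parenthesis
    \(              # literal opening parenthesis
    (               # group 2: the text inside the parentheses
        \d+         #   one or more digits
        \s*-\s*     #   a hyphen, optionally surrounded by whitespace
        \w+         #   at least one word character
        [^)]*       #   everything up to the closing parenthesis
    )
    \)              # literal closing parenthesis
''', re.VERBOSE)

final = []
for s in (s1, s2):
    m = PATTERN.search(s)
    if m:  # skip strings that do not contain a (<number>-word ..) part
        final.append(m.groups())

print(final)
```

Because group 2 must start with digits, the earlier "(bit-more complicated)" in the second string is skipped and stays part of group 1.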

Alternatively, you can use this regex with split:

>>> import re
>>> reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
>>> for el in arr:
...     reg.split(el)[:-1]
...
['hello 4, this is stackoverflow, looking for help', '1345-today is wednesday']
['hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday']


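Regex details:

(?<!\s) : asserts that the previous character is not whitespace
\s* : matches 0 or more whitespace characters
(?=\((\d+\s*-\s*\w+[^)]*)\)) : lookahead asserting that what follows is a string of the form (<number>-word ..). Note that the capturing group makes split include the text inside the (...) in its result.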
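To see why the split example slices off the last element, it may help to look at the full result of split. This is a small sketch using the same regex; the variable names `parts`, `prefix` and `inner` are illustrative.

```python
import re

reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'

parts = reg.split(s1)
# split returns three items here:
#   parts[0]: the text before the separator
#   parts[1]: the capturing group from inside the lookahead
#   parts[2]: the rest of the string, i.e. the parenthesized text itself,
#             which the lookahead checked but did not consume
print(parts)

prefix, inner = parts[:2]  # same effect as the [:-1] slice for a single match
print((prefix, inner))
```

The lookahead only checks the parenthesized text without consuming it, so that text is still present as the final element and has to be dropped.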
