Python create list of tuples of strings by splitting from regex pattern
Suppose I have these two strings:
s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'
I want to use a regular expression to match the pattern (number-words), then split each string to get a list of tuples:
final = [('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday'),
('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]
I tried \([0-9]+-(.*?)\) but it did not work.
What am I doing wrong? Any ideas on how to solve this?
Thanks in advance!
This might nudge you in the right direction:
>>> re.findall(r'^(.*) \((.+?)\)$', s1)
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
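As for what went wrong with the original attempt, a quick check suggests two things: the pattern allows no whitespace around the hyphen, so it cannot match the second string, and it has a single capture group, so findall returns only the text after the hyphen. The sketch below illustrates this and then applies the anchored pattern above to both strings. The names attempt, pattern and final are chosen here for illustration only.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# The pattern from the question: one capture group, no spaces allowed
# around the hyphen.
attempt = re.compile(r'\([0-9]+-(.*?)\)')
print(attempt.findall(s1))  # ['today is wednesday']
print(attempt.findall(s2))  # []

# The anchored pattern from the answer, applied to each string in turn.
pattern = re.compile(r'^(.*) \((.+?)\)$')
final = [m.groups() for m in map(pattern.match, (s1, s2)) if m]
print(final)
```

Because the first group is greedy and the pattern is anchored with $, it always picks the last parenthesized part of the string, which is why "(bit-more complicated)" stays in the first element.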
You can use this regex with findall:
>>> regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
>>> arr = ['hello 4, this is stackoverflow, looking for help (1345-today is wednesday)', 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)']
>>> for el in arr:
...     regx.findall(el)
...
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
[('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]
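To get the single list of tuples asked for in the question, the per-string results can be collected into one list. This is a minimal sketch that reuses regx and arr from the session above; the name final is taken from the question.

```python
import re

regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
arr = [
    'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)',
    'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)',
]

final = []
for el in arr:
    # findall returns a list of (group1, group2) tuples; the ^ anchor means
    # each string yields at most one, so extend() flattens the results.
    final.extend(regx.findall(el))

print(final)
```

A string with no (number-words) part simply contributes nothing to the list.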
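Regex details:
^(.*?): match 0 or more characters at the start of the string, lazily, and capture them in group #1
\s*: match 0 or more whitespace characters
\((\d+\s*-\s*\w+[^)]*)\): match a (<number>-word ..) string and capture the content inside the parentheses in group #2
Alternatively, you can use this regex with split: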
>>> import re
>>> reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
>>> for el in arr:
...     reg.split(el)[:-1]
...
['hello 4, this is stackoverflow, looking for help', '1345-today is wednesday']
['hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday']
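Regex details:
(?<!\s): assert that the previous position is not whitespace
\s*: match 0+ whitespace characters
(?=\((\d+\s*-\s*\w+[^)]*)\)): lookahead asserting that a (<number>-word ..) string follows. Note that the capture group is what makes the text inside (...) appear in the result of split.
The lookahead consumes nothing, so the last element returned by split is the untouched (...) text, which the [:-1] slice drops.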
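The same split pattern can be written in verbose mode so that each part carries its own comment. This is only a sketch: SPLIT_RE and split_trailing_group are hypothetical names introduced here, and returning None for a string without a (number-words) part is an assumption about the desired behaviour.

```python
import re

SPLIT_RE = re.compile(r'''
    (?<!\s)                    # current position is not preceded by whitespace
    \s*                        # whitespace to discard before the parenthesis
    (?=                        # lookahead, consumes nothing
        \(
        (\d+\s*-\s*\w+[^)]*)   # captured, so split() includes it in its result
        \)
    )
''', re.VERBOSE)

def split_trailing_group(text):
    """Return (head, inner) or None when no '(<number>-word ..)' part exists."""
    parts = SPLIT_RE.split(text)
    if len(parts) < 3:         # pattern did not match anywhere
        return None
    return tuple(parts[:2])    # drop the untouched '(...)' remainder

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'
print([split_trailing_group(s) for s in (s1, s2)])
```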