Regex: optional non-capturing group

Question

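I'm a beginner and have been stuck on this for a few days. I want to use Python to extract the normal sentences from a text, without the URL.
For example: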

1st text: '(some normal sentences...) https://www.(...)'  
2nd text: '(some normal sentences...) '

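When I use r'([\w+\.\s\W\@w]+)(?:https)', it only captures the sentences in the first text.

When I use r'([\w+\.\s\W\@w]+)(?:https)?', it captures the sentences in the second text, but for the first text it captures the whole text, URL included.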

Can anyone help me with my regex?

Answer 1

You can use a non-greedy regular expression:

>>> import re
>>> x
"1st text: '(some normal sentences...) https://www.(...)\n2nd text: '(some normal sentences...)"
>>> print(x)
1st text: '(some normal sentences...) https://www.(...)
2nd text: '(some normal sentences...)
>>> re.findall(r'\(\w.+?\)', x)
['(some normal sentences...)', '(some normal sentences...)']
>>> re.findall(r'\((\w.+?)\)', x)
['some normal sentences...', 'some normal sentences...']
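If the parentheses in the question are only placeholders and the real texts are plain sentences optionally followed by a URL, a variation on the same non-greedy idea may fit better. The character class in the question contains both \w and \W, so it matches any character; being greedy, it consumes the whole string and the optional (?:https)? then matches nothing. The sketch below makes the capture lazy and anchors the pattern at the end of the string, so the optional URL group gets a chance to match. The sample texts, the example.com URL, and the strip_url helper are assumptions for illustration, not part of the original question.

```python
import re

# Hypothetical sample inputs modelled on the question; the URL is a placeholder.
texts = [
    "some normal sentences... https://www.example.com/page",
    "some normal sentences... ",
]

# Lazy capture of the sentence, then an optional non-capturing URL,
# then the end of the string. The trailing $ forces the lazy group to
# keep growing until only a URL (or nothing) is left after it.
pattern = re.compile(r'^(.*?)\s*(?:https?://\S+)?\s*$', re.DOTALL)


def strip_url(text):
    """Return the text before an optional trailing URL (hypothetical helper)."""
    match = pattern.match(text)
    return match.group(1)


results = [strip_url(t) for t in texts]
print(results)
# ['some normal sentences...', 'some normal sentences...']
```

This assumes at most one URL and that it sits at the end of the text; a URL in the middle of a sentence would need a different approach, such as re.sub.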

Solution 1
0 2019-03-20 14:46:47
