Python Regex: Finding Non-Repeating Patterns

Question

I want to find patterns in a string such as the following:

a = "3. ablkdna 08. 15. adbvnksd 4."

The expected matches are:

match = "3. "
match = "4. "

I want to exclude runs of the pattern, i.e. anything matched by:

([0-9]+\.[\s]*){2,}

In other words, I only want numbered tokens that stand alone (a run of length 1), not 08. and 15., which appear back to back.

How should I implement this?

Answer 1

The following regex works for the two examples given:

import re

# Match a numbered token only when no other numbered token sits directly before or after it.
p = re.compile(r'(?<!\d\.\s)(?<!\d)\d+\.(?!\s*\d+\.)')

a = "3. ablkdna 08. 15. adbvnksd 4."
m = p.findall(a)
print(m)
# prints ['3.', '4.']

a = "3. (abc), adfb 8. 1. adfg 4. asdfasd"
m = p.findall(a)
print(m)
# prints ['3.', '4.']
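To make it easier to see what each part of that pattern does, here is the same regex written with re.VERBOSE and one comment per piece. This is only a restatement of the pattern above, not a different approach; the name `pattern` is just for illustration.

```python
import re

# Same pattern as above, split over several lines with re.VERBOSE.
pattern = re.compile(r"""
    (?<!\d\.\s)     # not directly preceded by "<digit>." plus one whitespace character
    (?<!\d)         # do not start in the middle of a number
    \d+\.           # the numbered token itself
    (?!\s*\d+\.)    # not followed, after optional whitespace, by another numbered token
""", re.VERBOSE)

result = pattern.findall("3. ablkdna 08. 15. adbvnksd 4.")
print(result)
# prints ['3.', '4.']
```

Note that the first lookbehind is fixed-width (Python's re module requires that), so it only covers exactly one whitespace character between tokens.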

The regex above is clearly not complete: there are many inputs for which it still returns false positives.
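Two inputs that I believe show the problem are sketched below. Because the lookbehind only covers a single whitespace character, a run whose tokens are separated by two spaces, or by no space at all, still leaks its last token. The test strings are made up for illustration.

```python
import re

p = re.compile(r'(?<!\d\.\s)(?<!\d)\d+\.(?!\s*\d+\.)')

# Two spaces between the tokens: the 3-character lookbehind no longer sees "7. ".
two_spaces = p.findall("7.  8.")
print(two_spaces)
# prints ['8.']

# No space between the tokens: the lookbehind expects a whitespace character.
no_space = p.findall("7.8.")
print(no_space)
# prints ['8.']
```

Both strings are matched in full by ([0-9]+\.[\s]*){2,}, so ideally neither should produce a result.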

A complete regex that excludes an arbitrary pattern would need something like the absent operator (?~exp), which was introduced in Ruby 2.4.1 and is not available in Python as of this writing.

As an alternative, how about a two-step solution that first removes the repeated runs and then matches what is left:

m = re.findall(r'\d+\.\s*', re.sub(r'(\d+\.\s*){2,}', '', a))

It may not be elegant.
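For reuse, the two steps could be wrapped in a small helper along these lines. The names `RUN`, `TOKEN` and `find_isolated` are my own and purely illustrative; the regexes are the ones from the one-liner above.

```python
import re

RUN = re.compile(r'(\d+\.\s*){2,}')   # two or more numbered tokens in a row
TOKEN = re.compile(r'\d+\.\s*')       # a single numbered token plus trailing whitespace

def find_isolated(text):
    """Return the numbered tokens that are not part of a run."""
    without_runs = RUN.sub('', text)      # step 1: delete every run
    return TOKEN.findall(without_runs)    # step 2: collect what is left

print(find_isolated("3. ablkdna 08. 15. adbvnksd 4."))
# prints ['3. ', '4.']
print(find_isolated("7.  8."))
# prints []
```

Unlike the lookaround version, the matches here keep their trailing whitespace, and runs separated by any amount of whitespace (including none) are dropped.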

Answer 1 was accepted on 2020-11-05 03:58:01.
