Selecting all permutations without repetition using a regular expression in Python

Question

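I have three classes of characters, say letters [A-Za-z], digits [0-9], and symbols [!@#$]. For the sake of argument, the specific symbols don't matter. I want to use a regular expression in Python to match every permutation of these three classes, of length 3, without repetition.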

For example, the following would match successfully:

a1!
4B_
*x7

The following would fail:

ab!
BBB
*x_
a1!B

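How would I go about this without explicitly writing out every possible permutation of the classes in the regular expression?

I have previously tried the following solution: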

import re
regex = r"""
              ([A-Za-z]|[0-9]|[!@#$])
    (?!\1)    ([A-Za-z]|[0-9]|[!@#$])
    (?![\1\2])([A-Za-z]|[0-9]|[!@#$])
    """
s = "ab1"
re.fullmatch(regex, s, re.VERBOSE)

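However, the string ab1 matches, which is incorrect. This is because the group references \1 and \2 refer to the actual text matched by the groups, not to the regular expressions contained in the groups.

So, how can I refer to the regular expressions contained in previously matched groups, rather than to their actual contents?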

Answer 1

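Your main problem is that you cannot use a backreference to negate part of a pattern. You can only use it to match, or reject, the same value that was captured in the corresponding capturing group.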

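Note that [^\1] matches any character other than the \x01 character, not any character other than what Group 1 holds, because inside a character class a backreference is no longer a backreference. ab1 matches because b is not equal to a (the (?!\1) check), and 1 is neither \x01 nor \x02, which are the only characters [\1\2] can match.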
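The difference between \1 outside and inside a character class can be checked directly. The following is a minimal sketch; the variable names are illustrative only and do not come from the question.

```python
import re

# Outside a character class, \1 is a backreference to the text captured by group 1.
backref_same = bool(re.fullmatch(r"(.)\1", "aa"))   # second char repeats the first
backref_diff = bool(re.fullmatch(r"(.)\1", "ab"))   # second char differs

# Inside a character class, \1 is an octal escape for the character U+0001.
class_is_ctrl = bool(re.fullmatch(r"[\1]", "\x01"))

# So [^\1] accepts anything except U+0001, including a repeat of group 1.
negated_repeat = bool(re.fullmatch(r"(.)[^\1]", "aa"))

print(backref_same, backref_diff, class_is_ctrl, negated_repeat)
```

The last check is the reason the attempt in the question lets ab1 through: the negated class never looks at what the earlier groups captured.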

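What you can use is a series of negative lookaheads that "exclude" a match under certain conditions, e.g. the string cannot contain two digits, two letters, or two special characters.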

rx = re.compile(r"""
  (?!(?:[\W\d_]*[^\W\d_]){2})      # no two letters allowed
  (?!(?:\D*\d){2})                 # no two digits allowed
  (?!(?:[^_!@\#$*]*[_!@\#$*]){2})  # no two special chars allowed
  [\w!@\#$*]{3}                    # three allowed chars
""", re.ASCII | re.VERBOSE)

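See the regex demo. In the demo, the negated character classes are replaced with .* because the test is run against a single multiline text rather than against separate strings.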
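To see how one of these lookaheads behaves on its own, here is a small sketch that keeps only the "no two digits" condition and allows any three characters after it. The name no_two_digits, the helper function, and the use of .{3} are assumptions made for illustration, not part of the answer's pattern.

```python
import re

# (?:\D*\d){2} succeeds only if two digits can be found ahead,
# so the negative lookahead rejects any string with two or more digits.
no_two_digits = re.compile(r"(?!(?:\D*\d){2}).{3}", re.ASCII)

def has_at_most_one_digit(s):
    """Return True for 3-character strings containing zero or one digit."""
    return bool(no_two_digits.fullmatch(s))

for s in ["a1!", "abc", "a12", "1a2"]:
    print(s, has_at_most_one_digit(s))
```

Because the lookahead is anchored at the start of the match and scans forward with \D*, the two digits do not need to be adjacent for the string to be rejected.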

See the Python demo:

import re
passes = ['a1!','4B_','*x7']
fails = ['ab!','BBB','*x_','a1!B']
rx = re.compile(r"""
  (?!(?:[\W\d_]*[^\W\d_]){2})      # no two letters allowed
  (?!(?:\D*\d){2})                 # no two digits allowed
  (?!(?:[^_!@\#$*]*[_!@\#$*]){2})  # no two special chars allowed
  [\w!@\#$*]{3}                    # three allowed chars
""", re.ASCII | re.VERBOSE)
for s in passes:
    print(s, ' should pass, result:', bool(rx.fullmatch(s)))
for s in fails:
    print(s, ' should fail, result:', bool(rx.fullmatch(s)))

Output:

a1!  should pass, result: True
4B_  should pass, result: True
*x7  should pass, result: True
ab!  should fail, result: False
BBB  should fail, result: False
*x_  should fail, result: False
a1!B  should fail, result: False

Answer 2

A simple solution is to not write out the permutations yourself, but to let Python do it with the help of itertools.

from itertools import permutations

patterns = [
    '[a-zA-Z]',
    '[0-9]',
    '[!@#$]'
]

regex = '|'.join(
    ''.join(p)
    for p in permutations(patterns)
)
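The snippet above only builds the pattern string. A possible way to use it, sketched here with the question's test strings, is to compile it and call fullmatch. Note one assumption: the symbol class is widened to [!@#$_*] so that the examples containing _ and * apply; the class in the answer itself would reject them.

```python
import re
from itertools import permutations

# Symbol class widened with _ and * (an assumption, see the note above).
patterns = ['[A-Za-z]', '[0-9]', '[!@#$_*]']

# One alternative per ordering of the three classes: 3! = 6 branches.
regex = re.compile('|'.join(''.join(p) for p in permutations(patterns)))

passes = ['a1!', '4B_', '*x7']
fails = ['ab!', 'BBB', '*x_', 'a1!B']

results = {s: bool(regex.fullmatch(s)) for s in passes + fails}
for s, matched in results.items():
    print(s, matched)
```

Every branch is exactly three characters long, so fullmatch also rejects longer inputs such as a1!B. The number of branches grows factorially with the number of classes, which is fine for three classes but becomes unwieldy for many more.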


Solution 1
1 2020-09-06 19:45:37

Solution 2
-1 2020-09-06 06:17:36
