How to generate 5-character string combinations (1 digit, two equal letters, and two different equal letters) without duplication

I am trying to generate all 5-character strings consisting of four letters (two distinct letters, each appearing exactly twice) and one digit.

Example for CORRECT combinations:

1aabb  
b1aab  
ca3ac  

Example for INCORRECT combinations:

1aaaa  -> incorrect because there are more than 2 equal letters
1aaab  -> Same as the previous
1abcd  -> No 2 equal letters + 2 equal different letters  

This is the code I am using:

from itertools import combinations, permutations, product

LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'

def aabb1(letters=LETTERS, digits=DIGITS):
    """Generate the distinct 5-character strings consisting of four
    letters (exactly two are equal and another repeat two are equal) and one digit.

    """
    combs = []
    for (a, b), (i, j), (x, y), d, k in product(
            permutations(letters, 2),   # Two letters (a repeated).
            combinations(range(4), 2),  # Positions for the repeated letter.
            combinations(range(2), 2),  # Positions for the second repeated letter.
            digits,                     # One digit.
            range(5)):                  # Positions for the digit.
        result = []
        result[i:i] = a,
        result[j:j] = a,
        result[x:x] = b,
        result[y:y] = b,
        result[k:k] = d,
        combs.append(''.join(result))

    print(len(combs))
    return combs

It prints 79,800 combinations, but this is incorrect because duplicated combinations are being counted.

The problem is that it first chooses some letter, for example a, to appear twice and then a second letter, like f, to appear twice, producing something like a3faf. Later it chooses f as the first letter and a as the second, and produces a3faf again.
In math I can solve this by dividing by 2 (79,800 / 2 = 39,900), but I am not sure how to do it properly in my code.

Can you suggest how I can prevent this in my code? That is, how do I get the combinations without duplication?

Change permutations(letters, 2) to combinations(letters, 2). permutations() will deliver both ('a', 'b') and ('b', 'a'), but combinations() will deliver just ('a', 'b'). Your combinations for the letter positions take care of all orderings of those letters, so you don't need to see each pair twice.
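To see the difference concretely, here is a small sketch comparing the two itertools functions. The two-letter alphabet 'ab' is only an illustrative assumption; the final line works out the expected count for the 20 letters and 7 digits used in the question.

```python
from itertools import combinations, permutations
from math import comb

# permutations() yields both orderings of each pair; combinations() yields one.
ordered = list(permutations('ab', 2))
unordered = list(combinations('ab', 2))

# Expected number of distinct strings for 20 letters and 7 digits:
# unordered letter pairs * arrangements of the four letters * digits * digit slots.
expected = comb(20, 2) * comb(4, 2) * 7 * 5

print(ordered)    # [('a', 'b'), ('b', 'a')]
print(unordered)  # [('a', 'b')]
print(expected)   # 39900
```

The expected value is exactly half of the 79,800 reported in the question, which matches the "divide by 2" reasoning.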

Edit: In addition to the previous fix, the positions of the second letter need to be derived from the positions of the first letter. So if 'a' is at indices 0 and 2 of the four letter positions, then 'b' must be at indices 1 and 3.

def aabb1(letters=LETTERS, digits=DIGITS):
    """Generate the distinct 5-character strings consisting of four
    letters (exactly two are equal and another repeat two are equal) and one digit.

    """
    letterdxs = set(range(4))
    combs = []
    for (a, b), (i, j), d, k in product(
            combinations(letters, 2),   # Two letters (a repeated).
            combinations(range(4), 2),  # Positions for the 1st repeated letter.
            digits,                     # One digit.
            range(5)):                  # Positions for the digit.
        x, y = sorted(letterdxs.difference((i, j)))  # Remaining positions, in ascending order.
        result = []
        result[i:i] = a,
        result[j:j] = a,
        result[x:x] = b,
        result[y:y] = b,
        result[k:k] = d,
        combs.append(''.join(result))
    print(len(combs))
    return combs
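One way to gain confidence in a fix like this is to check the output for duplicates and for strings that break the rules. The sketch below is an assumption-laden illustration rather than part of the answer: is_valid() and generate() are hypothetical names, and generate() is a compact stand-in that builds the same strings by position instead of by slice insertion.

```python
from collections import Counter
from itertools import combinations, product

LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'

def is_valid(s):
    """True if s has exactly one digit and two distinct letters, each appearing twice."""
    digit_count = sum(c.isdigit() for c in s)
    letter_counts = Counter(c for c in s if c.isalpha())
    return (len(s) == 5 and digit_count == 1
            and sorted(letter_counts.values()) == [2, 2])

def generate(letters=LETTERS, digits=DIGITS):
    """Hypothetical stand-in for aabb1(): yields each string once."""
    for (a, b), a_pos, d, k in product(
            combinations(letters, 2),   # Unordered pair of letters.
            combinations(range(4), 2),  # Positions of the first letter.
            digits,                     # One digit.
            range(5)):                  # Position of the digit.
        four = [a if p in a_pos else b for p in range(4)]
        four.insert(k, d)
        yield ''.join(four)

strings = list(generate())
print(len(strings), len(set(strings)), all(map(is_valid, strings)))
```

If the first two numbers printed are equal, there are no duplicates; the same three checks can be run on the list returned by aabb1().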

You can write a recursive function:

#1aabb
#b1aab
#ca3ac
from collections import Counter

LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'

def combinations(d, current=()):
    if len(current) == 5:
        yield ''.join(current)
        return
    counts = Counter(current)
    # Number of distinct letters already placed (at most two are allowed).
    distinct_letters = sum(c.isalpha() for c in counts)
    has_digit = any(c.isdigit() for c in counts)
    for i in d:
        if i.isdigit():
            # Only one digit may appear in the string.
            if not has_digit:
                yield from combinations(d, current + (i,))
        elif counts[i] == 1 or (counts[i] == 0 and distinct_letters < 2):
            # Either complete an existing pair or start a new one.
            yield from combinations(d, current + (i,))

result = list(combinations(LETTERS+DIGITS))

Output (first 100 results):

['bbcc2', 'bbcc4', 'bbcc5', 'bbcc6', 'bbcc7', 'bbcc8', 'bbcc9', 'bbc2c', 'bbc4c', 'bbc5c', 'bbc6c', 'bbc7c', 'bbc8c', 'bbc9c', 'bbdd2', 'bbdd4', 'bbdd5', 'bbdd6', 'bbdd7', 'bbdd8', 'bbdd9', 'bbd2d', 'bbd4d', 'bbd5d', 'bbd6d', 'bbd7d', 'bbd8d', 'bbd9d', 'bbff2', 'bbff4', 'bbff5', 'bbff6', 'bbff7', 'bbff8', 'bbff9', 'bbf2f', 'bbf4f', 'bbf5f', 'bbf6f', 'bbf7f', 'bbf8f', 'bbf9f', 'bbgg2', 'bbgg4', 'bbgg5', 'bbgg6', 'bbgg7', 'bbgg8', 'bbgg9', 'bbg2g', 'bbg4g', 'bbg5g', 'bbg6g', 'bbg7g', 'bbg8g', 'bbg9g', 'bbhh2', 'bbhh4', 'bbhh5', 'bbhh6', 'bbhh7', 'bbhh8', 'bbhh9', 'bbh2h', 'bbh4h', 'bbh5h', 'bbh6h', 'bbh7h', 'bbh8h', 'bbh9h', 'bbjj2', 'bbjj4', 'bbjj5', 'bbjj6', 'bbjj7', 'bbjj8', 'bbjj9', 'bbj2j', 'bbj4j', 'bbj5j', 'bbj6j', 'bbj7j', 'bbj8j', 'bbj9j', 'bbkk2', 'bbkk4', 'bbkk5', 'bbkk6', 'bbkk7', 'bbkk8', 'bbkk9', 'bbk2k', 'bbk4k', 'bbk5k', 'bbk6k', 'bbk7k', 'bbk8k', 'bbk9k', 'bbll2', 'bbll4']

For fixed length and format this straightforward code generates 39900 combinations:

LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'

def insdig(s, d):
    for i in range(5):
        ss = s[:i] + d + s[i:]
        print(ss)

def aabb1():
    for dig in DIGITS:
        for i in range(len(LETTERS)-1):
            for j in range(i+1, len(LETTERS)):
                a = LETTERS[i]
                b = LETTERS[j]
                insdig(a+a+b+b, dig)
                insdig(a+b+a+b, dig)
                insdig(b+a+a+b, dig)
                insdig(a+b+b+a, dig)
                insdig(b+a+b+a, dig)
                insdig(b+b+a+a, dig)

aabb1()
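If the strings are needed as data rather than printed output, the same enumeration could be written as a generator. This is only a sketch: iter_aabb1 and the PATTERNS tuple are names introduced here for illustration, with each pattern listing which of the two letters (index 0 or 1) goes in each of the four letter slots.

```python
from itertools import combinations

LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'

# The six arrangements of two pairs, in the same order as the insdig() calls above.
PATTERNS = (
    (0, 0, 1, 1), (0, 1, 0, 1), (1, 0, 0, 1),
    (0, 1, 1, 0), (1, 0, 1, 0), (1, 1, 0, 0),
)

def iter_aabb1(letters=LETTERS, digits=DIGITS):
    """Yield each valid 5-character string exactly once."""
    for dig in digits:
        for pair in combinations(letters, 2):
            for pattern in PATTERNS:
                s = ''.join(pair[p] for p in pattern)
                # Insert the digit at each of the five possible positions.
                for i in range(5):
                    yield s[:i] + dig + s[i:]

all_strings = list(iter_aabb1())
print(len(all_strings))  # 39900
```

Returning a generator keeps memory use low when the caller only needs to iterate, while list() still gives the full collection when required.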
