简体   繁体   中英

Generate a list of strings in python

I'm pretty new to python, so I thought I'd give it a shot for this quick script...

Given a set of input words: ie "dead", "beef", how would one programmatically generate all possible strings of a given length and padding character?

The result would look something like this (using a pad of 0 and length of 12):

deadbeef0000
dead0beef000
dead00beef00
dead000beef0
dead0000beef
0deadbeef000
0dead0beef00
0dead00beef0
0dead000beef
00deadbeef00
00dead0beef0
00dead00beef
000deadbeef0
000dead0beef
0000deadbeef

The naive method to generate this list might be:

for x in range(0, 5):
  pre = '0' * x
  for y in range(0, 5):
    mid = '0' * y
    for z in range (0, 5):
      post = '0' * z
      val = pre + 'dead' + mid + 'beef' + post
      if len(val) == 12:
        print val

Is there a more straightforward approach? I've looked into itertools for combinations, but can't get it to produce the desired result.

maybe something like:

>>> from itertools import permutations
>>> lst = ['dead', 'beef', '0', '0', '0', '0']
>>> for i in set(permutations(lst)):
...     ''.join(i)
... 
'beefdead0000'
'dead0beef000'
'dead000beef0'
'00dead0beef0'
'0beef000dead'
...

EDIT: following @TimPeters comment

>>> [''.join(i) for i in set(permutations(lst)) 
                                        if i.index('dead') < i.index('beef')]

or in a more general way:

>>> real_words = ['dead', 'beef']
>>> padding = 4
>>> [''.join(i) for i in set(permutations(real_words + ['0'] * padding)) 
                      if reduce(lambda x,y: i.index(x) < i.index(y), real_words)]

Assuming that, as your example showed, you want the "real words" to appear in their original order, this can be done directly (each desired output string generated exactly once, with no duplicates and no need to "weed out" extraneous strings). You need one "slot" for each word, and one slot for each pad character, where the total number of pad characters is the specified final length minus the sum of the lengths of the words. In your example, you have two words of length 4, for a total word length of 8, so you need 12-8 = 4 pad characters in each output string. This gives a total of 6 slots (2 for words and 4 for pad characters). The total number of output strings is thus 6-choose-2 = 6-choose-4 = 6*5/2 = 15. That also explains how to use itertools.combinations to get the results: you pick the indices for the 2 words from the set of all slot indices, or - equivalently - pick the indices for the 4 pad characters from the set of all slot indices. The code here does the former:

def get_strings(words, pad, length):
    from itertools import combinations
    nwords = len(words)
    npad = length - sum(len(word) for word in words)
    nslots = nwords + npad
    for ix in combinations(range(nslots), nwords):
        result = [pad] * nslots
        i = 0
        for j in ix:
            result[j] = words[i]
            i += 1
        yield "".join(result)

Then:

for s in get_strings(["dead", "beef"], "0", 12):
    print s

displays the 15 strings you wanted.

for x in range(0, 5):
    pre = '0' * x
    for y in range(0, 5-x):
        mid = '0' * y
        z = 5 - x - y
        post = '0' * z
        val = pre + 'dead' + mid + 'beef' + post
        print val

I'm assuming that in your example, you want dead to appear before beef always. If so, I believe the below code works.

Basically, what it does is it generates all partitions of the number of padding characters necessary that have a number of compartments equal to one more than the number of strings you pass in. This solution uses generators, so that strings are generated on the fly (ie lazily). Note that this will not work correctly unless len(padding_character) == 1 .

import itertools

def partitions(size, number):
    '''
Yields all partitions of `size` containing `number` compartments, where
a compartment may be 0, and where all permutations are returned
'''
    if number == 1:
        yield (size,)
    else:
        for i in range(0, size+1):
            for partition in partitions(size-i, number-1):
                yield (i,) + partition

def padding_generator(strings, padding_character, total_length):
    strings_length = sum(map(len, strings))
    padding_length = total_length - strings_length
    num_padding_locations = len(strings)+1
    for partition in partitions(padding_length, num_padding_locations):
        result = ''
        for padding_count, string in itertools.zip_longest(partition, strings, fillvalue=''):
            result += padding_character*padding_count
            result += string
        yield result

for string in padding_generator(['dead', 'beef'], '0', 12):
    print(string)

Results:

deadbeef0000
dead0beef000
dead00beef00
dead000beef0
dead0000beef
0deadbeef000
0dead0beef00
0dead00beef0
0dead000beef
00deadbeef00
00dead0beef0
00dead00beef
000deadbeef0
000dead0beef
0000deadbeef

Just for fun, a direct recursive approach:

def padded(to_pad, pad_with, pad_amount):
    if not to_pad:
        yield pad_with * pad_amount
        return
    for i in range(pad_amount + 1):
        for suffix in padded(to_pad[1:], pad_with, pad_amount - i):
            yield pad_with * i + to_pad[0] + suffix

Testing:

>>> list(padded(['dead', 'beef'], '0', 4))
['deadbeef0000', 'dead0beef000', 'dead00beef00', 'dead000beef0', 'dead0000beef',
 '0deadbeef000', '0dead0beef00', '0dead00beef0', '0dead000beef', '00deadbeef00',
 '00dead0beef0', '00dead00beef', '000deadbeef0', '000dead0beef', '0000deadbeef']

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM